Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify Tag Models [FC-0030] #87

Merged
merged 13 commits into from
Oct 3, 2023
Merged

Conversation

bradenmacdonald
Copy link
Contributor

@bradenmacdonald bradenmacdonald commented Sep 29, 2023

This is my work on openedx/modular-learning#85 , "clean up tagging models".

I have a few other cleanups in mind but I'm trying to keep this PR focused on one idea: Taxonomies are not aware of Objects. The Taxonomy cannot specify what types of objects it applies to, nor which ObjectTag subclass it works with. Instead that can be specified at a higher level in the API.

So the changes include:

  • Removed Taxonomy.object_tag_class - taxonomies no longer specify what type of objects they can tag. You can specify it when you call api.tag_object() though if you want to.
  • Removed ModelObjectTag - ObjectTag should only be subclassed to tag different types of objects, not to use different types of tags. It was confusing that we used ContentObjectTag when we want to tag content objects, but ModelObjectTag when we want to use Tags that are defined from another model.
  • Moved tag_object and autocomplete_tags from Taxonomy to be python functions in the API. These were not actually being subclassed anyways so don't need to be instance methods.
  • Removed tag_ref - all tags are now set by value. This makes things simpler and more consistent. (Values may change, it's true, but at any given point in time a value is a unique ID within a given taxonomy, and values are what users see. Tag IDs are still used internally and exposed via the API, but not used when setting tags. This lets us remove tag_ref and logic that was in ObjectTag is moved to more appropriate places.)
  • The language taxonomy creates tags as needed. We don't actually need the fixture anymore, though I haven't decided yet if it makes sense to delete it or not.
  • The language taxonomy allows en-uk etc. if it's defined in Django settings
  • I found ObjectTag.is_valid to be confusing. So I put validation into Django's clean methods and I converted is_valid to is_deleted. What before was is_valid is now not is_deleted
  • I removed the logic to "automagically" re-link an ObjectTag to the most similar taxonomy when its linked taxonomy is deleted. I'm worried this is too magical and could create inappropriate links to another org's taxonomy. Also, the new UX designs clearly allow updating taxonomies in-place, so it's unlikely this will need to happen often. If it does, someone can manually fix the taxonomy IDs using a script or the django admin, to make sure the explicitly specify the right taxonomy.

Private-ref: FAL-3477

TODO In Future PRs?:

@openedx-webhooks
Copy link

openedx-webhooks commented Sep 29, 2023

Thanks for the pull request, @bradenmacdonald! Please note that it may take us up to several weeks or months to complete a review and merge your PR.

Feel free to add as much of the following information to the ticket as you can:

  • supporting documentation
  • Open edX discussion forum threads
  • timeline information ("this must be merged by XX date", and why that is)
  • partner information ("this is a course on edx.org")
  • any other information that can help Product understand the context for the PR

All technical communication about the code itself will be done via the GitHub pull request interface. As a reminder, our process documentation is here.

Please let us know once your PR is ready for our review and all tests are green.

@openedx-webhooks openedx-webhooks added the open-source-contribution PR author is not from Axim or 2U label Sep 29, 2023
@bradenmacdonald bradenmacdonald added the FC Relates to an Axim Funded Contribution project label Sep 29, 2023
@bradenmacdonald bradenmacdonald force-pushed the braden/simplify-tag-models branch 3 times, most recently from 929e71e to 157e71f Compare September 29, 2023 19:22
@bradenmacdonald bradenmacdonald force-pushed the braden/simplify-tag-models branch from 157e71f to 54c01a5 Compare September 29, 2023 19:27
@@ -137,7 +138,7 @@ class Taxonomy(models.Model):
),
)
allow_multiple = models.BooleanField(
default=False,
default=False, # TODO: This should be true, or perhaps remove this property altogether
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ChrisChV @pomegranited Do you know if we actually need this allow_multiple flag? I don't see any requirement for it in the UX mockups etc.

Same question about required

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need allow_multiple, but required can be removed.
E.g. we only want one Language tag per content item, but we'll want to allow multiple LightCast tags to be added.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @pomegranited 👍

default=None,
on_delete=models.SET_NULL,
help_text=_(
"Tag associated with this object tag. Provides the tag's 'value' if set."
),
)
_name = case_insensitive_char_field(
null=False,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

null=False is the default for char fields, so no need to specify it here.

@bradenmacdonald bradenmacdonald changed the title Simplify Tag Models [WIP] [FC-0030] Simplify Tag Models [FC-0030] Sep 29, 2023
# ("1", "tag_id", True), # Valid
# ("0", "tag_id", False), # Invalid user
# ("test_id", "tag_id", False), # Invalid user id
# ("1", None, False), # Testing parent validations
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bradenmacdonald This case refers more to call the super() functions to call the parent class validations. Since that logic has been removed, I think it is okay to delete this test

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Yeah for now I deleted it but put in some other new related tests.

@@ -207,37 +252,25 @@ class TestLanguageTaxonomy(TestTagTaxonomyMixin, TestCase):
Test for Language taxonomy
"""

# FIXME: something is wrong with this test case. It's setting the string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@ChrisChV ChrisChV left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bradenmacdonald Looks good to me 👍 I left some comments about the commented tests

  • I tested this runing the test.
  • I read through the code.
  • Includes documentation

Copy link
Contributor

@pomegranited pomegranited left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really great, @bradenmacdonald !
I left a few inline nits and minor questions, but overall it's so much better than what we had before. I'm happy for you to merge + tag the new version once these are addressed.

👍

@@ -1296,3 +1296,4 @@
allow_multiple: false
allow_free_text: false
visible_to_authors: true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The language taxonomy creates tags as needed. We don't actually need the fixture anymore, though I haven't decided yet if it makes sense to delete it or not.

I agree that we don't need the fixture to create the _tags_that are found in the language taxonomy -- so we could delete lines 1-1288 here.

However, we still need lines 1289+ in this fixture to create the Language taxonomy itself?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, sorry I wasn't clear. We can delete the languages, but we still need to create the taxonomy. However, if it's just creating a single object, I would prefer to move that to be a data migration instead of a fixture, to keep things simple.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah gotcha.. data migration is fine with me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And if we remove the fixture, we can also remove openedx_tagging/core/tagging/management/commands/.

Comment on lines 536 to 546
def clean(self):
if self.tag:
if self.tag.taxonomy_id != self.taxonomy_id:
raise ValidationError("ObjectTag's Taxonomy does not match Tag taxonomy")
elif self.tag.value != self._value:
raise ValidationError("ObjectTag's _value is out of sync with Tag.value")
else:
# Note: self.taxonomy and/or self.tag may be NULL which is OK, because it means the Tag/Taxonomy
# was deleted, but we still preserve this _value here in case the Taxonomy or Tag get re-created in future.
if self._value == "":
raise ValidationError("Invalid _value - empty string")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This + is_deleted and the validate_<field> methods is a much clearer way to deal with ObjectTag validation, thank you! So nice to see how this flowed into the system taxonomies too.

Couple of points:

  1. Do we also need to check that self._name == self.taxonomy.name here? I'm pretty sure Taxonomy names can be changed via the UI too.
  2. This method needs a docstring -- and I'm puzzled why pylint didn't pick that up?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Argh, it turns out that pylint wasn't running properly because openedx_tagging/core/__init__.py didn't exist. I'll have to fix that in a future PR because there are too many lint issues showing up now. But I've fixed all the ones related to this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh man! Yes, better to fix that in another PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bradenmacdonald

I'll have to fix that in a future PR because there are too many lint issues showing up now.

Do you want me to submit a PR for this? I've got time, and there's another couple of minor things that should be fixed:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pomegranited Yes please, that would be great! Thank you :)

openedx_tagging/core/tagging/models/base.py Outdated Show resolved Hide resolved
self.taxonomy = None
self.tag = None
# We used to have code here that would try to find a new taxonomy if the current taxonomy has been deleted.
# But for now that's removed, as it risks things like linking a tag to the wrong org's taxonomy.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯

Check if 'external_id' is part of this Taxonomy.
"""
lang_code = external_id.lower()
LANGUAGE_DICT = getattr(settings, "LANGUAGE_DICT", dict(settings.LANGUAGES)) # setting may or may not exist.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I guess settings.LANGUAGE_DICT is an old name for settings.LANGUAGES? Didn't know that.

But if we need to check both of them, then don't we need to check them both here and here too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Close... Actually settings.LANGUAGES is the normal setting, but it's a list of tuples. For convenience/performance, the cms/lms also define LANGUAGE_DICT = dict(LANGUAGES) which converts it into a dict.

I was already on the fence about whether to use LANGUAGE_DICT or not, but your comment shows to me that it's certainly not clear as it is. So I will add some comments in to clarify before merging.



class UserModelObjectTag(ModelObjectTag):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we're not subclassing ObjectTag for system-defined taxonomies anymore (yay!), the System Taxonomy creation ADR should be updated to describe the use of Taxonomy subclasses instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I updated it. I'm going to merge now but feel free to leave comments on this diff if you have any, and I'll address them in future PRs.

@bradenmacdonald bradenmacdonald force-pushed the braden/simplify-tag-models branch from a4937b7 to ecff923 Compare October 3, 2023 21:27
@bradenmacdonald bradenmacdonald merged commit 5de0969 into main Oct 3, 2023
@bradenmacdonald bradenmacdonald deleted the braden/simplify-tag-models branch October 3, 2023 21:36
@openedx-webhooks
Copy link

@bradenmacdonald 🎉 Your pull request was merged! Please take a moment to answer a two question survey so we can improve your experience in the future.

@bradenmacdonald
Copy link
Contributor Author

@ChrisChV @pomegranited Thank you for the very nice and timely reviews!

@pomegranited
Copy link
Contributor

@bradenmacdonald

TODO: Make Tag.lineage do one query only

I'm not sure this can be done without either a) using a proper tree-aware model, or b) using recursive CTE which requires MySQL 8. So we mitigated the performance risk by limiting the maximum tree depth to 3.

Also, the use case for Tag.get_lineage() has changed since the UI designs came in, so it might be better to return a full serializable list of Tags instead of just their values? Not sure, but I think that'll depend on how the frontend and search backend handle the implicit parent tags.

@bradenmacdonald
Copy link
Contributor Author

@pomegranited You can see what I'm thinking in #92 . Any thoughts welcome, especially if you think it may not be an improvement in some way. I'm unsure about the performance because I'm doing a lot of JOINs in that version. Once @yusuf-musleh finishes the test data sets, I want to test all of these on some very large taxonomies and see how they perform.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
FC Relates to an Axim Funded Contribution project open-source-contribution PR author is not from Axim or 2U
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

4 participants