Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extracted Logic from UserService into RepoUrlValidator #496

Closed
wants to merge 12 commits into from
Closed
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 3 additions & 3 deletions src/main/java/edu/byu/cs/util/RepoUrlValidator.java
Original file line number Diff line number Diff line change
Expand Up @@ -16,14 +16,14 @@ public class RepoUrlValidator {

public static boolean isValidRepoUrl(String url) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renaming the static method from RepoUrlValidator.clean() to RepoUrlValidator.cleanRepoUrl() introduces redundancy and duplication in the naming. Additional justification should be provided to defend this as a good change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll push back only against the idea that "additional justification" is needed here.

It's just cause clean() and cleanRepoUrl() were merged, and that happened to be the one that was kept. If we think clean() is a bit more concise, then sure

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just cause clean() and cleanRepoUrl() were merged, and that happened to be the one that was kept.

The problem here is that the two functions named above were merged when they were never intended to be merged. They were intentionally written as separate methods, and a little more effort should be undertaken to understand why they were written separately before zealously merging them together.

Consider the following original call signatures:

  • static boolean isValid(String)
  • static String clean(String) throws InvalidRepoUrlException
  • String clean(String) throws InvalidRepoUrlException

Can you think of any reasons they would be written/exposed as separate call signatures? What additional reasons would you find if you searched for the usages of the functions in the app?

Here are several reasons for the individual and separate existence of the call signature named above:

  • String clean(String) throws ...: This is the non-static function that actually does all of the work. Since this class is intended to be used almost entirely via the static methods, but this method is not static, this is functionally equivalent to a protected function. Most clients of the class will never see or care about this function, but it is provided in the public API so that advanced users could access or override it if desired. This is the main entry point into the working logic.
    • Notice that, in order to provide sufficient context for all the different places this code is used, the base implementation throws errors when the string is invalid, and returns a string when it is valid.
    • This gives future developers maximum freedom to reimplement the guts of this function while still satisfying the original call signature. For example, the current implementation is hard-coded to only accept GitHub.com URLs. It is very feasible that in the future, users of this code would want to allow URLs from other websites (like cs.byu.edu or bitbucket.com etc...). Different parsing rules could be inserted, or the direct parsing could be entirely replaced with a more generic rule validator, all without changing the call signature.
    • Notice that none of the app currently uses this call signature directly at the present time.
  • static String clean(String) throws ...: This static method is a convenience wrapper over the internal function that makes it available via static import.
    • Why don't we just write the function in the static namespace? Because the implementation of the code may want to update state as instance variables of the class (as one example). Consistent with the discussion in Convert Tesing Utilities to Static Imports #495, we are not avoiding static imports entirely, but leveraging the best of both worlds. On the one hand, creating an instance variable allows us to have separate state for each instance along with other potential benefits; and on the other hand, exposing a public static method on the API makes it clean and simple for clients to use.
    • As a static method, it will be invoked with the full name of the class.
    • This is exposed as a Façade over the real implementation. As such, this function can consider the perspective and needs of the calling clients. Therefore, the name is written to reduce duplication and provide a clear behavior when called as RepoUrlValidator.clean(String).
  • static boolean isValid(String): This is provided as net another convenience function over the real implementation.
    • Notice that this function does not throw an error! Instead, this function adapts the error call and alway returns a boolean True/False value to simply indicate the existence of a valid Repo URL.
    • This is used in multiple places in the app to test if the url is valid without needing to see the actual cleaned URL. This operation is performed by the tests and at a few decision points in the algorithm.
    • This saves duplication from multiple clients where they would need to catch the error and convert it into a boolean value manually.
    • If RepoUrl checking were an expensive operation, this function could be implemented differently to provide a lightweight check for validity without actually going through the expensive operation. However, since this is not the case for us, the implementation simply delegates to the root delegation and then adapts the response.

File cloningDir = new File("./tmp" + UUID.randomUUID());
boolean valid = true;
try {
GitHelper.fetchRepo(cloningDir, url);
Copy link
Contributor

@frozenfrank frozenfrank Dec 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding this new comment, this implementation is now particularly dangerous because RepoUrlValidator.isValid() and RepoUrlValidator.clean() are based on different conditions. This will cause problems, and if it's not already causing problems, it's because I didn't write tricky enough tests (that's my bad).

The isValid() function and the clean() function are intended to go together. They need to always work together or fail together. If they are evaluating different conditions (isValid() performs a check that the repo clones properly, but clean() skips that check and merely evaluates the string according to predefined string parsing rules, then the app will misbehave very quickly.

Allow me to present a few test cases that would break the system based on the current setup, let alone future changes:

Repo URL isValid() clean() Comment
https://github.com/frozenfrank/chess Baseline. Valid URL.
https://bitbucket.com/frozenfrank/chess Clones successfully, but does not match the current string rules.
https://github.com/not-a-student/chess Does not exist, but does match our string rules
https://github.com/19mdavenport/cs240chessStarter In the future, we hope to reject this URL because it is a fork of the base repo.

The original edition of the code has the problem as well, but this PR, if accepted, would make the problem even worse. Previously, a student could potentially submit the URL into the system based on one set of rules, and then have it fail during evaluation due to a different set of rules performing the work. With these changes, the difference in the code is not partially changed within the autograding system. Some calls would work and some would fail all within the same system which are assumed and intended to be consistent.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Furthermore, this change still does not address the original issue at hand: that code which directly implements a git clone is duplicated in the system and should be replaced with a call to the existing implementation located in GitHelper.java #463 (comment).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It already is using the now extracted method of GitHelper.
And, for the 4 cases that you provided, I'm confused on which ones are failing that shouldn't or passing when they should fail? Your examples follow what should already happen, no? Because the current chain of code calls clean and validifies it (updateRepoUrl calls clean and setRepoUrl, which calls isValidRepoUrl (although I could rename it to isValid))

} catch (GradingException e) {
FileUtils.removeDirectory(cloningDir);
TheDavSmasher marked this conversation as resolved.
Show resolved Hide resolved
return false;
valid = false;
}
FileUtils.removeDirectory(cloningDir);
return true;
return valid;
}

/**
Expand Down
Loading