Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extract git logic from UserService #463

Open
ThanGerlek opened this issue Nov 7, 2024 · 4 comments · May be fixed by #497
Open

Extract git logic from UserService #463

ThanGerlek opened this issue Nov 7, 2024 · 4 comments · May be fixed by #497
Assignees
Labels
good first issue Good for newcomers

Comments

@ThanGerlek
Copy link
Contributor

ThanGerlek commented Nov 7, 2024

Overview

The UserController directly implemented a git clone command as a temporary filler while the real functionality was being implemented by another developer. The filler behavior should be replaced with a function call to the actual behavior, and optionally, the cloning behavior merged it and preserved as a requirement.

Discussion

[UserService's isValidRepoUrl()] method seems like it doesn't belong in [that] class. Dealing with the specifics of git and cloning objects feels like it belongs in a different layer.

I would recommend moving this function into the git/ directory (similar to having a separate DAO layer from the service layer).

Originally posted by @frozenfrank in #450 (comment)

Specifics

This is the code in question:

private static boolean isValidRepoUrl(String url) {
File cloningDir = new File("./tmp" + UUID.randomUUID());
CloneCommand cloneCommand = Git.cloneRepository().setURI(url).setDirectory(cloningDir);
try (Git git = cloneCommand.call()) {
LOGGER.debug("Cloning repo to {} to check repo exists", git.getRepository().getDirectory());
} catch (GitAPIException e) {
FileUtils.removeDirectory(cloningDir);
return false;
}
FileUtils.removeDirectory(cloningDir);
return true;
}

Notice that, at a high level, this function is re-implementing the code already written in the GitHelper.java file. In fact, it was probably copy & pasted from there, and then tweaked for it's new environment.

private void fetchRepo(File intoDirectory) throws GradingException {
gradingContext.observer().update("Fetching repo...");
CloneCommand cloneCommand = Git.cloneRepository()
.setURI(gradingContext.repoUrl())
.setDirectory(intoDirectory);
try (Git git = cloneCommand.call()) {
LOGGER.info("Cloned repo to {}", git.getRepository().getDirectory());
} catch (GitAPIException e) {
throw new GradingException("Failed to clone repo: " + e.getMessage(), e);
}
}

The simplest, although incorrect, resolution to this PR would be to replace the bulk of the body of UserService:: isValidRepoUrl() with a function call to GitHelper::fetchRepo(), and adapt the response. That solution could look like this:

    private static boolean isValidRepoUrl(String url) {
        File cloningDir = new File("./tmp" + UUID.randomUUID());
        try {
            // NOTE: Would require publicly exposing the method, and
            // allowing it to be called with an arbitrary URL (it currently
            // relies on the `GradingContext` object for the URL).
            GitHelper.fetchRepo(cloningDir, repoUrl);
            return true;
        } catch (GradingException ex) {
            return false;
        } finally {
            FileUtils.removeDirectory(cloningDir);
        }
    }

However, as discussed in #463 (comment), this direct replacement of functionality is not quite sufficient. In this case, we need to replace it with new code that performs it's original intent.

The end result should really look like the following. Note that at this point, the isValidRepoUrl() function serves only as an adapter over the external RepoUrlValidator.isValid() method. It is a design choice to leave it as-is, or directly substitute RepoUrlValidator.isValid() into all (there is only one) occurrence within the UserController.java file. While there are reasons to leave it separate, in this case, it may make plenty of sense to go ahead and perform that substitution. (Note that this does not apply to the changing of the RepoUrlValidator API. That is not up for consideration here.)

    private static boolean isValidRepoUrl(String url) {
        return RepoUrlValidator.isValid(url);
    }

Scope of Changes

If this PR only made the above change, it would be just fine and acceptable, and it would address the original issue.

If the minimal required change is pursued, the following are expected to all be true:

  1. The direct git clone implementation is defined in exactly one location (GitHelper)
  2. Minimal changes are made to surrounding areas of the code
  3. No changes are made to public call signatures

However, we may be able to do better. If the developer so chooses, this issue could properly merge the two checks together so that a successfully validating a URL via the RepoUrlValidator.isValid() and RepoUrlValidator.clean() method performed the git clone command in addition to the current string parsing and in preparation for other future checks.

Since performing a git clone command dramatically changes the runtime characteristics of the function used in multiple places, a careful review would first need to be conducted to ensure that this doesn't break any existing assumptions or introduce a performance bottleneck. Specifically, the original string parsing was lightweight and fast, but a git clone performs multiple network calls and downloads the entire repository onto the machine. Some repos could be very large, or the internet could be down, and it could be that not all of the code was prepared for this. It is fine to convert this into a separate issue for development independent of this original issue.

If the merging behavior is pursued, the statements are expected to all be true, in addition to those shared earlier about the minimal required change:

  1. The isValid(), and clean() methods of RepoUrlValidator are consistent with each other. By some method such as delegation, adapting, or calling the same functions, the pair of functions produce a corresponding successful or a failure message for all possible inputs.
  2. The implementation of RepoUrlValidator follow the style of the original implementation. That is— the functionality is written as methods on an object, and a convenience static function adapts and exposes the for use external to the class.

Intention

This is intended to give a clear, complete, and detailed overview of the state of this issue as well as the expectations for a PR that hopes to resolve this issue. If there remain ambiguities or questions about this issue, please post a clarifying question as a comment.

Credit

Co-authored by @frozenfrank.

@ThanGerlek
Copy link
Contributor Author

I found the code that performs this work in GitHelper! We may have to expose the private function, but let's prefer to reuse code rather than allowing it to be duplicated.

Originally posted by @frozenfrank in #450 (comment)

@ThanGerlek
Copy link
Contributor Author

I agree it shouldn't be here. That would introduce a secondary entry point and usage context for GitHelper, though, and at present GitHelper is very nicely self-contained and cohesive. Rather than broaden the Single Responsibility of GitHelper, I propose we rename GitHelper to GitGradingHelper or something similar, then create a GitHelper class containing pure static functions that don't depend on a GradingContext. Then both UserService and GitGradingHelper can depend on it.

Thoughts @frozenfrank ?

Originally posted by @ThanGerlek in #450 (comment)

@TheDavSmasher
Copy link

I went over this issue and the code has been moved and reformatted, alongside a few code quality changes

@frozenfrank
Copy link
Contributor

While considering a resolution to this issue, it's important to recognize the history of the code being manipulated.

The code in question, which performs a git clone command was first introduced in this commit:

private static boolean isValidRepoUrl(String url) {
CloneCommand cloneCommand = Git.cloneRepository().setURI(url);
try (Git git = cloneCommand.call()) {
LOGGER.info("Test cloning git repo to {}", git.getRepository().getDirectory());
} catch (GitAPIException e) {
LOGGER.error("Failed to clone repo", e);
return false;
}
return true;
}

At this point in the development, front-end changes to allow RepoUrl changes was being developed in parallel to back-end upgrades to the repo-url validation. This code was added in the front-end as a temporary implementation until the rest of the code was ready. That partially explains that directly git clone code implementation; this code wasn't intended to be the final version.

The git clone operation was intended to be an easy and lightweight check to verify that the URL actually represents a repo Url (as opposed to some other random Url) in an automated way.

Now that we have the RepoUrlValidator class (which didn't exist at the time), it may be appropriate to skip the git clone operation and rely exactly on the logic that is implemented in the rest of the system. Calling the same function will result in the fewest surprises due to unexpected and unforeseen corner cases.

This will be even more important based based on the discoveries in softwareconstruction240/softwareconstruction#138. In the near future (before Winter 2025 semester), we want to verify that the Repo Url is not only valid, but also not a fork. If this logic is applied inconsistently across the codebase, it could easily lead to frustration for the students, or the TAs attempting to debug the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Good for newcomers
Projects
None yet
4 participants