added design doc for sparse checkout #335

Merged
merged 17 commits into from
Aug 27, 2024
40 changes: 40 additions & 0 deletions research/design-doc/sparse_checkout_asishkumar.md
@@ -0,0 +1,40 @@
# KPM sparse checkout

**Author**: Asish Kumar

## Abstract

`kpm` manages third-party libraries through Git repositories, requiring a `kcl.mod` file at the root directory. It treats the entire Git repository as a single `kcl` package, which is inefficient for monorepos containing multiple `kcl` packages. Often, a `kcl` project depends on just one package within a monorepo, but `kpm` downloads the entire repository. Therefore, `kpm` needs to allow adding a subdirectory of a Git repository as a dependency, enabling it to download only the necessary parts and improve performance.

## User Interface
zong-zhe marked this conversation as resolved.

The user provides the Git URL of the subdirectory. An example command looks like this:

```
kcl mod add --git https://github.com/kcl-lang/modules/tree/main/argoproj --tag <tag>
```

`kpm` parses the Git URL and extracts the subdirectory path using the `GetPath()` function from the github.com/kubescape/go-git-url package. It then downloads only that subdirectory and appends the path to the `subdir` array in the `kcl.mod` file.
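As a rough illustration of the parsing step, the sketch below splits a GitHub tree URL using only the Go standard library. It is not the go-git-url implementation: `splitTreeURL` is a hypothetical helper, and it assumes the ref after `/tree/` is a single path segment, so a branch name containing slashes would be split incorrectly.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// splitTreeURL is a hypothetical helper (not part of kpm or go-git-url).
// It splits a URL of the form
//   https://<host>/<owner>/<repo>/tree/<ref>/<subdir...>
// into the repository URL, the ref, and the subdirectory path.
// Assumption: <ref> is a single path segment.
func splitTreeURL(raw string) (repo, ref, subdir string, err error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", "", "", err
	}
	parts := strings.Split(strings.Trim(u.Path, "/"), "/")
	if len(parts) < 2 || parts[0] == "" || parts[1] == "" {
		return "", "", "", fmt.Errorf("not a repository URL: %s", raw)
	}
	repo = u.Scheme + "://" + u.Host + "/" + parts[0] + "/" + parts[1]
	// Only URLs with a /tree/<ref>/ part carry a subdirectory.
	if len(parts) >= 4 && parts[2] == "tree" {
		ref = parts[3]
		subdir = strings.Join(parts[4:], "/")
	}
	return repo, ref, subdir, nil
}

func main() {
	repo, ref, subdir, err := splitTreeURL(
		"https://github.com/kcl-lang/modules/tree/main/argoproj")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(repo, ref, subdir)
}
```

A plain repository URL without a `/tree/` part yields an empty `subdir`, which corresponds to today's behavior of treating the whole repository as one package.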

The `kcl.mod` file will look like this:

```
[dependencies]
bbb = { git = "https://github.com/kcl-lang/modules", commit = "ade147b", subdir = ["add-ndots"]}
```

`subdir` is a list so that, if the user later adds another subdirectory from the same Git repository, it can be appended without overwriting the existing entry.

Contributor

There are still some things I don't understand here.

> then it can be added without overwriting the current subdir

If we refer to different directories in the same repository, I think the kcl.mod would look like:

[dependencies]
bbb = { git = "https://github.com/kcl-lang/modules/add-ndots", commit = "ade147b"}
another_bbb = { git = "https://github.com/kcl-lang/modules/another-add-ndots", commit = "ade147b"}

And, the two packages are stored in two different directories.

So can you show an example to describe the problem you mentioned above?

> The subdir is a list because in the future if user wants to add another subdir from the same git repo then it can be added without overwriting the current subdir.

Contributor Author

@officialasishkumar, Jun 3, 2024

In my proposal, I am trying to achieve this directory structure :

├── .kcl
│   ├── kpm
│   │   ├── modules
│   │   │   ├── k8s
│   │   │   ├── agent

> And, the two packages are stored in two different directories.

Do you mean this directory structure?

├── .kcl
│   ├── kpm
│   │   ├──modules
│   │   │   └── k8s
│   │   ├── modules
│   │   │   ├── agent

Contributor

The two file trees you gave above, are they different on the file system 😳 ? They seem to be just different drawings of the same file tree.

> In my proposal, I am trying to achieve this directory structure :

The directory structure is correct and I have no problem. My question is, why is there a subdir field in kcl.mod, and why is it a list. Is there a problem with some mechanism of go-getter ?

Contributor Author

> The two file trees you gave above, are they different on the file system?

Sorry, I misunderstood something.

> why is there a subdir field in kcl.mod, and why is it a list. Is there a problem with some mechanism of go-getter ?

No, there is no problem with go-getter. I have removed subdir flag. Thanks for the suggestion!

Contributor Author

#335 (comment)

Should I revert back my subdir/package change in kcl.mod?

Contributor

Hi @officialasishkumar

I think you can follow the cargo design for this feature. I have one question about your design: why do you want to use a list as the subdir? You should also add some details about how users can import dependencies in KCL code.

Contributor Author

I have removed subdir. I think I will be able to parse and download the dependencies by just keeping the whole URL.

Contributor

Hi @officialasishkumar 😄, I think there may be some misunderstandings between us. What needs to be determined in this document is how users can use this function properly. I don't worry about whether this function can be implemented. This feature does not involve some complex and hard parts, and there should be no obstacles to implementation. Maybe you could consider the option I mentioned earlier #335 (comment)


## Design

The subdirectory path will be passed to `CloneOptions` in [pkg/git/git.go](https://github.com/kcl-lang/kpm/blob/d20b1acdc988f600c8f8465ecd9fe04225e19149/pkg/git/git.go#L19) as `subDir`.

### Using go-getter

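As described in the [go-getter](https://pkg.go.dev/github.com/hashicorp/go-getter#readme-subdirectories) docs, a subdirectory is selected by appending it to the repository URL after a double slash (`//`). In the `WithRepoURL` function, we can append `subDir` from `CloneOptions` to the URL in this way, but only when `subDir` is not empty.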
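A minimal sketch of that URL construction follows. The `cloneOptions` struct and the `getterURL` helper are hypothetical stand-ins for illustration and do not mirror kpm's actual `CloneOptions` or `WithRepoURL`. The only go-getter behavior relied on is the documented `//` subdirectory separator, which goes before any query parameters such as `?ref=`.

```go
package main

import (
	"fmt"
	"strings"
)

// cloneOptions is a hypothetical stand-in for kpm's CloneOptions,
// reduced to the two fields this sketch needs.
type cloneOptions struct {
	repoURL string
	subDir  string
}

// getterURL returns the URL to hand to go-getter. When subDir is set,
// it is appended after a double slash, ahead of any query string.
func getterURL(opts cloneOptions) string {
	subDir := strings.Trim(opts.subDir, "/")
	if subDir == "" {
		// No subdirectory requested: fetch the whole repository.
		return opts.repoURL
	}
	base, query, hasQuery := strings.Cut(opts.repoURL, "?")
	out := strings.TrimRight(base, "/") + "//" + subDir
	if hasQuery {
		out += "?" + query
	}
	return out
}

func main() {
	fmt.Println(getterURL(cloneOptions{
		repoURL: "https://github.com/kcl-lang/modules?ref=main",
		subDir:  "argoproj",
	}))
}
```

Keeping the empty-`subDir` case as a plain pass-through means existing dependencies that point at a whole repository are unaffected.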

## References

1. https://medium.com/@marcoscannabrava/git-download-a-repositorys-specific-subfolder-ceeabc6023e2
2. https://pkg.go.dev/github.com/hashicorp/go-getter
3. https://pkg.go.dev/github.com/kubescape/go-git-url