Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added design doc for sparse checkout #335

Merged
merged 17 commits into from
Aug 27, 2024

Conversation

officialasishkumar
Copy link
Contributor

1. Does this PR affect any open issues?(Y/N) and add issue references (e.g. "fix #123", "re #123".):

2. What is the scope of this PR (e.g. component or file name):

3. Provide a description of the PR(e.g. more details, effects, motivations or doc link):

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Other

4. Are there any breaking changes?(Y/N) and describe the breaking changes(e.g. more details, motivations or doc link):

  • N
  • Y

5. Are there test cases for these changes?(Y/N) select and add more details, references or doc links:

  • Unit test
  • Integration test
  • Benchmark (add benchmark stats below)
  • Manual test (add detailed scripts or steps below)
  • Other

@officialasishkumar
Copy link
Contributor Author

PTAL

cc: @Peefy @zong-zhe

@coveralls
Copy link

coveralls commented May 24, 2024

Pull Request Test Coverage Report for Build 10505658929

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 1292 unchanged lines in 14 files lost coverage.
  • Overall coverage decreased (-3.7%) to 40.57%

Files with Coverage Reduction New Missed Lines %
pkg/cmd/cmd_update.go 20 0.0%
pkg/opt/opt.go 29 42.47%
pkg/api/kpm_pkg.go 29 80.0%
pkg/package/toml.go 34 78.79%
pkg/settings/settings.go 39 57.36%
pkg/cmd/cmd_add.go 45 0.0%
pkg/downloader/downloader.go 55 55.41%
pkg/mvs/mvs.go 62 55.88%
pkg/git/git.go 71 52.79%
pkg/package/package.go 83 26.11%
Totals Coverage Status
Change from base Build 9394460463: -3.7%
Covered Lines: 3186
Relevant Lines: 7853

💛 - Coveralls

@Peefy
Copy link
Contributor

Peefy commented May 28, 2024

Hello @officialasishkumar

I think we only need to set a subdirectory for the Git repository, as it may contain multiple kcl modules, while OCI, Local source is not required.

@officialasishkumar officialasishkumar force-pushed the sparse-checkout-design-doc branch from e467457 to a52c590 Compare May 28, 2024 02:31
@officialasishkumar
Copy link
Contributor Author

officialasishkumar commented May 28, 2024

Hello @officialasishkumar

I think we only need to set a subdirectory for the Git repository, as it may contain multiple kcl modules, while OCI, Local source is not required.

The current design doc addresses this.

@Peefy
Copy link
Contributor

Peefy commented May 28, 2024

The current design doc addresses this.

I noticed that the design document has not changed, it still supports setting sub folders for local source and oci source, which is not necessary.

@officialasishkumar
Copy link
Contributor Author

@Peefy Apologies for not being specific. I have updated the PR. Please let me know if any more changes are needed.

kpm add --subdir <subdir> <git-repo-url>
```

The `--subdir` flag will be optional. If the flag is not provided, `kpm` will download the entire repository as it does now. If the flag is provided, `kpm` will download only the specified subdirectory. The `kcl.mod` file will be generated with the path to the subdirectory.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The kcl.mod file will be generated with the path to the subdirectory.

Here, I personally think that if there is no kcl.mod file in the subdirectory downloaded by the user, we should output diagnostic information and notify the user, rather than generate the file for him, because the kcl.mod file will affect the compilation result in some cases, if the user does not know that kpm have generated the file for him, it may cause trouble to his use.

Example usage:

```
kpm add --subdir 1.21/* k8s
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, I recommend that this function should be operated on a valid KCL package and not extended to various files and file directories, so I do not recommend adding support such as * or a *.k file.

--subdir xxx/xxx # only a subdir here, and the subdir must contains `kcl.mod` file, if not, raise an error.

In the future, if there is a demand for *, we can consider the research again to decide whether to support it. It is too early to introduce this feature in the path details.


```
[dependencies]
bbb = { path = "../bbb", subdir = ["test-*", "test-*"]}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the workflow for kpm when it get the kcl.mod like this, could you please add more details ?

The main point that confuse me is:

The main purpose of this work was to support the addition of dependencies from subdirectories of git repo.

bbb = { path = "../bbb", subdir = ["test-", "test-"]}

It doesn't seem to have anything to do with git


As mentioned in the [go-getter](https://pkg.go.dev/github.com/hashicorp/go-getter#readme-subdirectories) docs, we can append our subDir from `CloneOptions` (only if subDir is not empty) in `WithRepoURL` function.

### using go-git
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kpm used to use go-git to interact with git repo, but one issue that forced me to switch to go-getters. Let me give you a few more details that may be helpful.

When supporting taking git repo as dependencies, there is one tricky issue to consider: authentication. go-gitand the native git client do not share the same authentication, which means that if you use go-git, you need to consider the authentication in kpm and implement it. It's a lot of work.

And the authentication part of go-git itself is not yet complete: go-git/go-git#490

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. So I will be using go-getter methods.

@zong-zhe
Copy link
Contributor

zong-zhe commented May 29, 2024

@officialasishkumar Sorry for the late review 😳, which is my mistake. I finished the review yesterday, but I forgot to click submit review...

research/design-doc/sparse_checkout_asishkumar.md Outdated Show resolved Hide resolved

```
[dependencies]
bbb = { git = "https://github.com/kcl-lang/flask-demo-kcl-manifests.git", commit = "ade147b", subdir = ["something"]}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why subdir is a list ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I defined this because in the future if we want to add other subdir from a git repo say kcl-lang/modules then we would be able to achieve that without overwritting the previous subdir

bbb = { git = "https://github.com/kcl-lang/modules", commit = "ade147b", subdir = ["add-ndots"]}
```

The subdir is a list because in the future if user wants to add another subdir from the same git repo then it can be added without overwritting the current subdir.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are still some things I don't understand here.

then it can be added without overwritting the current subdir

If we refer to different directories in the same repository, I think the kcl.mod will looks like:

[dependencies]
bbb = { git = "https://github.com/kcl-lang/modules/add-ndots", commit = "ade147b"}
another_bbb = { git = "https://github.com/kcl-lang/modules/another-add-ndots", commit = "ade147b"}

And, the two packages are stored in two different directories.

So can you show an example to describe the problem you mentioned above?

The subdir is a list because in the future if user wants to add another subdir from the same git repo then it can be added without overwritting the current subdir.

Copy link
Contributor Author

@officialasishkumar officialasishkumar Jun 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my proposal, I am trying to achieve this directory structure :

├── .kcl
│   ├── kpm
│   │   ├── modules
│   │   │   ├── k8s
│   │   │   ├── agent

And, the two packages are stored in two different directories.

Do you mean this directory structure?

├── .kcl
│   ├── kpm
│   │   ├──modules
│   │   │   └── k8s
│   │   ├── modules
│   │   │   ├── agent

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The two file trees you gave above, are they different on the file system 😳 ? They seem to be just different drawings of the same file tree.

In my proposal, I am trying to achieve this directory structure :

The directory structure is correct and I have no problem. My question is, why is there a subdir field in kcl.mod, and why is it a list. Is there a problem with some mechanism of go-getter ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The two file trees you gave above, are they different on the file system?

Sorry, I misunderstood something.

why is there a subdir field in kcl.mod, and why is it a list. Is there a problem with some mechanism of go-getter ?

No, there is no problem with go-getter. I have removed subdir flag. Thanks for the suggestion!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#335 (comment)

Should I revert back my subdir/package change in kcl.mod?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @officialasishkumar

I think you can follow the cargo design for this feature. I have one question about your design: why do you want to use a list as the subdir? You should also add some details about how users can import dependencies in KCL code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have removed subdir. I think i will be able to parse and download the dependencies by just keeping the whole url.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @officialasishkumar 😄, I think there may be some misunderstandings between us. What needs to be determined in this document is how users can use this function properly. I don't worry about whether this function can be implemented. This feature does not involve some complex and hard parts, and there should be no obstacles to implementation. Maybe you could consider the option I mentioned earlier #335 (comment)

@zong-zhe
Copy link
Contributor

zong-zhe commented Jun 6, 2024

Hi @officialasishkumar 😄

I've been looking at the cargo's design for this feature. I have some new findings to share with you. Hope this helpful.

I find this design comfortable to use. ❤️
It is here: https://doc.rust-lang.org/stable/cargo/reference/specifying-dependencies.html#renaming-dependencies-in-cargotoml

The cargo.toml looks like this:

[package]
name = "mypackage"
version = "0.0.1"

[dependencies]
bar = { git = "https://github.com/example/project", package = "foo" }

It will clone the git repo -> recursively look for all cargo.toml in the subdirectory -> find the package named foo -> load it.

I tried to use this feature on a project and found several advantages of this design.

  • Compared to adding a subdirectory to the URL, this way the user does not need to bother typing the URL. The URL of the subdirectory in the browser is not directly usable. 😅
Screenshot 2024-06-06 at 11 13 02
  • When a user decides to use a dependency, he should probably know the package name of the dependency. 😆

@officialasishkumar officialasishkumar force-pushed the sparse-checkout-design-doc branch from e66801e to cb6d79d Compare June 6, 2024 17:13
Signed-off-by: Asish Kumar <[email protected]>
Signed-off-by: Asish Kumar <[email protected]>
@officialasishkumar
Copy link
Contributor Author

officialasishkumar commented Jul 12, 2024

@zong-zhe I have a question. Which option is better with the user perspective?
Pass the url directly or pass it using a flag option (only the package name).

@zong-zhe
Copy link
Contributor

zong-zhe commented Jul 15, 2024

.
Hi @officialasishkumar 😄

From my point of view, I think using a flag option (only the package name) is better.

@zong-zhe
Copy link
Contributor

Hi @officialasishkumar 😄

Hi, please update this doc and synchronize the results of our discussion with the final design decision. In this PR, we need to keep the design document consistent with the implementation, as both the document and the final code implementation are required.

Signed-off-by: Asish Kumar <[email protected]>
@officialasishkumar
Copy link
Contributor Author

@Peefy Thanks for the comment. I have updated accordingly. Please have a look!

@Peefy
Copy link
Contributor

Peefy commented Aug 20, 2024

@Peefy Thanks for the comment. I have updated accordingly. Please have a look!

Where updated? ...

@Peefy
Copy link
Contributor

Peefy commented Aug 20, 2024

@Peefy Thanks for the comment. I have updated accordingly. Please have a look!

Where updated? ...

I can't see it in the design doc...

@Peefy
Copy link
Contributor

Peefy commented Aug 20, 2024

@officialasishkumar

What I mean is that you need to complete the design doc and clearly display the unfinished tasks in the doc.

@officialasishkumar
Copy link
Contributor Author

officialasishkumar commented Aug 20, 2024

@officialasishkumar
Copy link
Contributor Author

officialasishkumar commented Aug 20, 2024

@officialasishkumar

What I mean is that you need to complete the design doc and clearly display the unfinished tasks in the doc.

According to discussion with zong-zhe, I don't think there is any unfinished tasks.

https://github.com/kcl-lang/kpm/pull/335/files#diff-2a349b8c20e04265bf3cb4f6c9fec5fa61f1c405844b07a94b7cc98310286805R47-R49

@Peefy
Copy link
Contributor

Peefy commented Aug 20, 2024

I don't think there is any unfinished tasks.

However, I use the kpm v0.10.0-beta.1 to test the feature, it does not work for me

kcl mod init me && cd me
kcl mod add --git https://github.com/kcl-lang/modules --tag v0.1.0 --package helloworld
kcl mod metadata
dependency 'helloworld' not found in '/Users/lingzhi/.kcl/kpm/helloworld_v0.1.0'

@Peefy
Copy link
Contributor

Peefy commented Aug 20, 2024

I don't think there is any unfinished tasks.

However, I use the kpm v0.10.0-beta.1 to test the feature, it does not work for me

kcl mod init me && cd me
kcl mod add --git https://github.com/kcl-lang/modules --tag v0.1.0 --package helloworld
kcl mod metadata
dependency 'helloworld' not found in '/Users/lingzhi/.kcl/kpm/helloworld_v0.1.0'

cc @officialasishkumar @zong-zhe

@officialasishkumar
Copy link
Contributor Author

Oh my bad, i will do the changes on those subcommand and update it by tonight.

@officialasishkumar
Copy link
Contributor Author

Just for confirmation, are only graph, metadata, update are needed to be synched right?

@Peefy
Copy link
Contributor

Peefy commented Aug 20, 2024

Just for confirmation, are only graph, metadata, update are needed to be synched right?

kcl run is also needed this.

@officialasishkumar
Copy link
Contributor Author

Just for confirmation, are only graph, metadata, update are needed to be synched right?

kcl run is also needed this.

kcl mod run is working

@officialasishkumar
Copy link
Contributor Author

Hey @Peefy, Can you please check and let me know what all things are remaining? I will start working on them at the earliest.

@Peefy
Copy link
Contributor

Peefy commented Aug 21, 2024

Hey @Peefy, Can you please check and let me know what all things are remaining? I will start working on them at the earliest.

Hello @officialasishkumar

Just merge all design PRs and feature development PR, and test them to pass including unit tests and e2e tests including kcl run, kcl mod add --package , kcl mod metadata, kcl mod graph, etc. Besides, users can update Git repository sub packages by adding it through kcl mod add or updating kcl.mod file and running kcl mod update.


The user can then run `kcl mod run` to run the code:

`kcl mod run`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/kcl mod run/kcl run

Signed-off-by: Asish Kumar <[email protected]>
@Peefy
Copy link
Contributor

Peefy commented Aug 22, 2024

@officialasishkumar @zong-zhe

The metadata of modules should be modules_v0.1.0 instead of helloworld_v0.1.0

$  cat kcl.mod
[package]
name = "me"
edition = "v0.10.0"
version = "0.0.1"

[dependencies]
helloworld = { git = "https://github.com/kcl-lang/modules", tag = "v0.1.0", package = "helloworld" }
$ kcl mod metadata
{"packages":{"helloworld":{"name":"helloworld","manifest_path":"~/.kcl/kpm/helloworld_v0.1.0/helloworld"}}}

@officialasishkumar
Copy link
Contributor Author

If i am right I have to change the fullname to the repo name?

Are there any things that needs to be updated with repo name?

@Peefy

@Peefy
Copy link
Contributor

Peefy commented Aug 22, 2024

Yes, using the repo name as the full path looks more logical, see this issue #384 for more information.

@officialasishkumar
Copy link
Contributor Author

@Peefy

should the key name will also changed to repo name?

[dependencies]
check-kcl = { git = "https://github.com/officialasishkumar/check-kcl", commit = "831fada36e155c8758f07f293c8267c869af69d3", package = "k8s" }

or

[dependencies]
k8s = { git = "https://github.com/officialasishkumar/check-kcl", commit = "831fada36e155c8758f07f293c8267c869af69d3", package = "k8s" }

@officialasishkumar
Copy link
Contributor Author

officialasishkumar commented Aug 22, 2024

The kcl.mod.lock looks like this rn after my changes. Lmk if everything looks good i will open a PR.

[dependencies]
  [dependencies.check-kcl]
    name = "check-kcl"
    full_name = "check-kcl_831fada36e155c8758f07f293c8267c869af69d3"
    version = "1.28"
    url = "https://github.com/officialasishkumar/check-kcl"
    commit = "831fada36e155c8758f07f293c8267c869af69d3"
    package = "k8s"

@officialasishkumar
Copy link
Contributor Author

and the metadata

$ /home/charon/coding/open-source/kpm/testme metadata
{"packages":{"check_kcl":{"name":"check_kcl","manifest_path":"/home/charon/.kcl/kpm/check-kcl_831fada36e155c8758f07f293c8267c869af69d3/test"}}}

@Peefy
Copy link
Contributor

Peefy commented Aug 22, 2024

The form is right. The key should be always the package name.

[dependencies]
k8s = { git = "https://github.com/officialasishkumar/check-kcl", commit = "831fada36e155c8758f07f293c8267c869af69d3", package = "k8s" }

@officialasishkumar
Copy link
Contributor Author

PTAL @zong-zhe @Peefy

Copy link
Contributor

@zong-zhe zong-zhe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@zong-zhe zong-zhe merged commit b8fd1da into kcl-lang:main Aug 27, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Enhancement: kpm add command displaying optimization
4 participants