Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: Rename key,id,dockey to docID terminology #1749

Merged
merged 35 commits into from
Jan 4, 2024

Conversation

shahzadlone
Copy link
Member

@shahzadlone shahzadlone commented Aug 2, 2023

Relevant issue(s)

BREAKING CHANGE:

  • Use of _key to access a document's unique id is now deprecated, instead use _docID.
  • Use of dockey/id is now deprecated, instead use docID.
  • Use of dockeys/ids is now deprecated, instead use docIDs.

Description

  • Rename _key to _docID everywhere.
  • Rename _keys to docIDs in explain the response.
  • Rename _newKey to _docIDNew in backup/recover functionality.
  • Fix _docID tests.
  • Fix explain and backup/recover functionality tests.
  • Fix the collectionID order for a P2P test (leaving a note as order was reverted).
  • Update all cids.
  • Rename all files with key(s)|dockey(s) in the name to doc_id(s).
  • Document breaking change to pass change detector.

For Reviewers:

  • Main commits to review are the PR(MAIN) commits.
  • If you have more time PR(MINOR) and PR(*TEST) commits are good to go over

Disclaimer / Discussion:

I do not like these non-underscored docID/docIDs, would be in favor of underscoring these :

query {
    User(docIDs: ["bae-6a6482a8-24e1-5c73-a237-ca569e41507d"]) {
        _docID
    }
}
query {
    Users {
        Name
        _docID
        _version {
            docID
        }
    }
}
Request: ` {
    commits(groupBy: [docID], order: {docID: DESC}) {
        docID
    }
}`,
Results: []map[string]any{
    {
        "docID": "bae-f54b9689-e06e-5e3a-89b3-f3aee8e64ca7",
    },
    {
        "docID": "bae-72f3dc53-1846-55d5-915c-28c4e83cc891",
    },
},

EDIT: Above was resolved with #2162, to do out of this PR.

Limitations (out of scope of this PR):

@shahzadlone shahzadlone added area/testing Related to any test or testing suite code quality Related to improving code quality deprecate Indicates something is deprecated. labels Aug 2, 2023
@shahzadlone shahzadlone self-assigned this Aug 2, 2023
@shahzadlone shahzadlone added this to the DefraDB v0.7 milestone Aug 2, 2023
api/http/errors.go Outdated Show resolved Hide resolved
@shahzadlone shahzadlone force-pushed the gql-renaming-phase1 branch from 5e19d45 to c3b1964 Compare August 3, 2023 11:27
@shahzadlone shahzadlone changed the title refac: Rename _key to _docID as unique id refac: Rename 'key' terminology to 'docID' Aug 3, 2023
@shahzadlone shahzadlone changed the title refac: Rename 'key' terminology to 'docID' refac: Rename '_key' terminology to '_docID' Aug 3, 2023
@shahzadlone shahzadlone changed the title refac: Rename '_key' terminology to '_docID' refac: Rename key terminology to docID Aug 3, 2023
@shahzadlone shahzadlone changed the title refac: Rename key terminology to docID refac: Rename _key terminology to _docID Aug 3, 2023
@shahzadlone shahzadlone changed the title refac: Rename _key terminology to _docID refactor: Rename _key to _docID terminology Aug 3, 2023
@shahzadlone shahzadlone force-pushed the gql-renaming-phase1 branch 2 times, most recently from 1283538 to f102b7d Compare September 7, 2023 05:36
@shahzadlone shahzadlone force-pushed the gql-renaming-phase1 branch 2 times, most recently from f102b7d to 33d5f6a Compare September 14, 2023 12:57
@shahzadlone shahzadlone force-pushed the gql-renaming-phase1 branch 4 times, most recently from b761bde to 6305cb5 Compare October 5, 2023 03:56
Copy link
Collaborator

@fredcarle fredcarle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely going in the right direction. I added a few comments and then looked at how many other reference to dockey there was. There were hundreds so I create a branch (https://github.com/fredcarle/defradb/tree/fredcarle/chore/dockey-rename) with what I found. There might be more changes needed to cover everything.

@@ -98,7 +100,7 @@ var (
VersionFields = []string{
HeightFieldName,
CidFieldName,
DockeyFieldName,
DocID,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

todo: I'm pretty sure this needs to be DocIDFieldName

Copy link
Member Author

@shahzadlone shahzadlone Dec 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I apply the above suggestion of To keep things consistent, DocIDFieldName. and then do this then it would mean we change from "docID" to "_docID", which is what I like too. However @jsimnz had a hard opinion to not underscore the field names (yet atleast).

This happens because previously we used dockey/dockeys/id/ids, which now became docID/docIDs.

Note that I am against these (my eyes get cancer from them everytime), also noted in PR description:

query {
    User(docIDs: ["bae-6a6482a8-24e1-5c73-a237-ca569e41507d"]) {
        _docID
    }
}
query {
    Users {
        Name
        _docID
        _version {
            docID
        }
    }
}
Request: ` {
    commits(groupBy: [docID], order: {docID: DESC}) {
        docID
    }
}`,
Results: []map[string]any{
    {
        "docID": "bae-f54b9689-e06e-5e3a-89b3-f3aee8e64ca7",
    },
    {
        "docID": "bae-72f3dc53-1846-55d5-915c-28c4e83cc891",
    },
},

My vote is for _ version for all instances of docID(s) (and in future for every internal field :P)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also have a very strong preference for all internal fields to have _ as it makes it very clear what they are.

client/request/consts.go Outdated Show resolved Hide resolved
merkle/clock/clock_test.go Outdated Show resolved Hide resolved
planner/commit.go Outdated Show resolved Hide resolved
planner/delete.go Outdated Show resolved Hide resolved
@shahzadlone
Copy link
Member Author

shahzadlone commented Dec 18, 2023

Definitely going in the right direction. I added a few comments and then looked at how many other reference to dockey there was. There were hundreds so I create a branch (fredcarle/defradb@fredcarle/chore/dockey-rename) with what I found. There might be more changes needed to cover everything.

Yes most of those are local vars, description strings, comments and test function names. Can take care of them too within this PR if there is preference

@shahzadlone shahzadlone force-pushed the gql-renaming-phase1 branch 4 times, most recently from f4ba7a8 to 853cb77 Compare December 21, 2023 06:45
@shahzadlone shahzadlone requested a review from fredcarle January 3, 2024 17:05
@shahzadlone shahzadlone changed the title refactor: Rename _key to _docID terminology refactor: Rename key(s),id(s),dockey(s) to docID(s) terminology Jan 3, 2024
@shahzadlone shahzadlone changed the title refactor: Rename key(s),id(s),dockey(s) to docID(s) terminology refactor: Rename key,id,dockey to docID terminology Jan 3, 2024
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: @AndrewSisley is the order change here expected?

Copy link
Contributor

@AndrewSisley AndrewSisley Jan 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not off the top of my head, I can't remember what order the results we returned in - there is a chance it is sorted by version id though

@AndrewSisley
Copy link
Contributor

AndrewSisley commented Jan 3, 2024

thought: I am finding this really hard to effectively review. There are a lot of independent changes here, and reviewing them all at once with them blended into each other is quite tricky. This is made harder by the nature of the changes (renaming).

The commits do not help reviewers as much as they could, as they are not what I would consider complete - where a commit begins a change (e.g. Rename _key to _dockey), but leaves the code in a broken state to be patched by later commits (e.g. Apply naming within unit tests). This means as a reviewer I cannot effectively review the rename of _key to _dockey unless I review a guessed range of commits (and hope nothing else is interleaved within that range).

In the future I would suggest either breaking up the PR (I understand why you may want it in one here). Or broadening the scope of the commits so that reviewers can review a single, complete, build-passes change in a single commit.

I'm not yet asking for you to change this in this PR, but thought I'd leave the feedback.

@shahzadlone
Copy link
Member Author

shahzadlone commented Jan 3, 2024

thought: I am finding this really hard to effectively review. There are a lot of independent changes here, and reviewing them all at once with them blended into each other is quite tricky. This is made harder by the nature of the changes (renaming).

The commits do not help reviewers as much as they could, as they are not what I would consider complete - where a commit begins a change (e.g. Rename _key to _dockey), but leaves the code in a broken state to be patched by later commits (e.g. Apply naming within unit tests). This means as a reviewer I cannot effectively review the rename of _key to _dockey unless I review a guessed range of commits (and hope nothing else is interleaved within that range).

In the future I would suggest either breaking up the PR (I understand why you may want it in one here). Or broadening the scope of the commits so that reviewers can review a single, complete, build-passes change in a single commit.

I'm not yet asking for you to change this in this PR, but thought I'd leave the feedback.

I appreciate the feedback and understand the concern, however due to the nature of this PR (renaming). I had to pick between either (1) smaller / consumable / somewhat structured but not clean commits (as the build might fail), OR (2) according to your suggestion range of all commits that form into a clean state as one big commit (i.e. passing build / complete change).

I went with (1) as according to the nature of the PR the review is mainly about weather the renaming makes sense or not, with the assumption that no logic has changed hence it would build. (2) would also in some cases require to change cids multiple times, for individual complete change to pass the build, or like mentioned a really large commit body if I want to change cids once (which to be fair is possible and very easy for me to create even now, I can merge quickly all renaming change commits into one).

Nonetheless will keep this in mind in future for non-rename PRs, I do normally keep commits clean as much as possible.

@AndrewSisley
Copy link
Contributor

AndrewSisley commented Jan 3, 2024

would also in some cases require to change cids multiple times, for individual complete change to pass the build

This is fine, and would quite probably be a net time time saver for the team in a 3000 line PR. You could alternatively sacrifice the build and bundle the cid changes (only) into a single commit at the end.

Copy link
Contributor

@AndrewSisley AndrewSisley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some small comments submitted now as I'm context switching mid-review.

client/collection.go Outdated Show resolved Hide resolved
client/doc_id.go Outdated Show resolved Hide resolved
client/document.go Outdated Show resolved Hide resolved
@@ -32,23 +32,16 @@ const (
// This list is incomplete and undefined errors may also be returned.
// Errors returned from this package may be tested against these errors with errors.Is.
var (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: Why were so many errors deleted? Were they unused?

Copy link
Member Author

@shahzadlone shahzadlone Jan 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, all deleted errors were unused. It's because the linter doesn't catch unused public variables when they are become unused by developers, the same way it catches unused private variables.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, thanks then for deleting them :)

@@ -36,14 +32,21 @@ const (
OrderClause = "order"
DepthClause = "depth"

DocIDArgName = "docID"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise: These are much better const names than the originals, thank you :)

GroupFieldName = "_group"
DeletedFieldName = "_deleted"
SumFieldName = "_sum"
VersionFieldName = "_version"

// New generated document id from a backed up document,
// which might have a different _docID originally.
NewDocIDFieldName = "_docIDNew"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: Why has the position of new in this constant changed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously when we had a _key, we had a corresponding _newKey within backup and recover functionality to mark existing and new keys.

Now with the introduction of a camel case name: _docID, I didn't like _newdocID for obvious reasons nor _newDocID for the simple reason that if we want to do a regex search in the future for all usages of docID it won't match.

So I went with _docIDNew.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😅 I'm not sure a slightly simpler regex search is worth the relatively unexpected 'new' location. But I'm not going to push for a change here, it isn't very important to me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keeping as is, unless someone has a stronger opinion

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just used in backups so it's low impact. I'm ok with _docIDNew.

db/base/collection_keys.go Outdated Show resolved Hide resolved
db/collection.go Outdated Show resolved Hide resolved
db/base/collection_keys.go Outdated Show resolved Hide resolved
db/base/collection_keys.go Outdated Show resolved Hide resolved
Copy link
Contributor

@AndrewSisley AndrewSisley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, just some small comments added. Have now reviewed the full PR.

praise: Thank you very much for spreading further the use of the request consts, and cleaning up their names.

Strongly suggest getting in touch with Fred when you are looking to merge this, he has an open PR atm that overlaps a fair amount with this one.

@@ -1,38 +0,0 @@
## defradb client document

Create, read, update, and delete documents.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: Why have these files been deleted?

Copy link
Member Author

@shahzadlone shahzadlone Jan 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These seem outdated docs, do lemme know if anything isn't outdated according to you but I deleted (my guess is that overtime the make docs command generates new documentation, but does not remove old ones).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are auto-generated?

Copy link
Member Author

@shahzadlone shahzadlone Jan 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea so for example docs/cli/defradb_client_collection_docIDs.md was regenerated with new file name but was originally docs/cli/defradb_client_collection_keys.md

And this file that I deleted: defradb_client_document.md I am pretty sure has been out dated by defradb_client_collection.md: https://github.com/shahzadlone/defradb/blob/gql-renaming-phase1/docs/cli/defradb_client_collection.md

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah okay nice - I did not know this :)

I've opened #2166 and #2165

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't either LOL,

Cheers for opening these

planner/commit.go Show resolved Hide resolved
planner/type_join.go Outdated Show resolved Hide resolved
@shahzadlone
Copy link
Member Author

Looks good to me, just some small comments added. Have now reviewed the full PR.

praise: Thank you very much for spreading further the use of the request consts, and cleaning up their names.

Strongly suggest getting in touch with Fred when you are looking to merge this, he has an open PR atm that overlaps a fair amount with this one.

Cheers thanks for reviewing this PR :), for sure will coordinate with @fredcarle

@@ -429,7 +429,7 @@ func (vf *VersionedFetcher) Close() error {
}

// NewVersionedSpan creates a new VersionedSpan from a DataStoreKey and a version CID.
func NewVersionedSpan(dockey core.DataStoreKey, version cid.Cid) core.Spans {
func NewVersionedSpan(dsKey core.DataStoreKey, version cid.Cid) core.Spans {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise: This was a confusing one before this change. Thanks for fixing!

db/collection_delete.go Outdated Show resolved Hide resolved
db/collection_delete.go Outdated Show resolved Hide resolved
db/collection_get.go Outdated Show resolved Hide resolved
db/collection_get.go Outdated Show resolved Hide resolved
Copy link
Collaborator

@fredcarle fredcarle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks a lot for doing this Shahzad. It a really nice change and I love that it removes the confusion I sometimes had when not paying too too much attention between dockey and datastore key. Now it's unmistakable.

Please resolve the conversations before merging :)

@shahzadlone shahzadlone modified the milestones: DefraDB v0.7, DefraDB v0.9 Jan 4, 2024
@shahzadlone shahzadlone merged commit 9da4dab into sourcenetwork:develop Jan 4, 2024
58 of 59 checks passed
@shahzadlone shahzadlone deleted the gql-renaming-phase1 branch January 4, 2024 23:03
shahzadlone added a commit to shahzadlone/defradb that referenced this pull request Jan 22, 2024
## Relevant issue(s)
- Subtask of sourcenetwork#1750 [EPIC]
- [x] Resolves sourcenetwork#1752
- [x] Resolves sourcenetwork#1272


## BREAKING CHANGE:
- Use of `_key` to access a document's unique id is now deprecated,
instead use `_docID`.
- Use of `dockey`/`id` is now deprecated, instead use `docID`.
- Use of `dockeys`/`ids` is now deprecated, instead use `docIDs`.


## Description
- [x] Rename `_key` to `_docID` everywhere.
- [x] Rename `_keys` to `docIDs` in explain the response.
- [x] Rename `_newKey` to `_docIDNew` in backup/recover functionality.
- [x] Fix `_docID` tests.
- [x] Fix explain and backup/recover functionality tests.
- [x] Fix the collectionID order for a P2P test (leaving a note as order
was reverted).
- [x] Update all cids.
- [x] Rename all files with `key(s)|dockey(s)` in the name to
`doc_id(s)`.
- [x] Document breaking change to pass change detector.

## For Reviewers:
- Main commits to review are the `PR(MAIN)` commits. 
- If you have more time `PR(MINOR) and PR(*TEST)` commits are good to go
over


## Disclaimer / Discussion:
I do not like these non-underscored `docID/docIDs`, would be in favor of
underscoring these :

```
query {
    User(docIDs: ["bae-6a6482a8-24e1-5c73-a237-ca569e41507d"]) {
        _docID
    }
}
```

```
query {
    Users {
        Name
        _docID
        _version {
            docID
        }
    }
}
```

```
Request: ` {
    commits(groupBy: [docID], order: {docID: DESC}) {
        docID
    }
}`,
Results: []map[string]any{
    {
        "docID": "bae-f54b9689-e06e-5e3a-89b3-f3aee8e64ca7",
    },
    {
        "docID": "bae-72f3dc53-1846-55d5-915c-28c4e83cc891",
    },
},

```

EDIT: Above was resolved with sourcenetwork#2162, to do out of this PR.


## Limitations (out of scope of this PR):
- sourcenetwork#1467
- sourcenetwork#1751
- sourcenetwork#1550
- sourcenetwork#2162
shahzadlone added a commit to shahzadlone/defradb that referenced this pull request Feb 23, 2024
## Relevant issue(s)
- Subtask of sourcenetwork#1750 [EPIC]
- [x] Resolves sourcenetwork#1752
- [x] Resolves sourcenetwork#1272


## BREAKING CHANGE:
- Use of `_key` to access a document's unique id is now deprecated,
instead use `_docID`.
- Use of `dockey`/`id` is now deprecated, instead use `docID`.
- Use of `dockeys`/`ids` is now deprecated, instead use `docIDs`.


## Description
- [x] Rename `_key` to `_docID` everywhere.
- [x] Rename `_keys` to `docIDs` in explain the response.
- [x] Rename `_newKey` to `_docIDNew` in backup/recover functionality.
- [x] Fix `_docID` tests.
- [x] Fix explain and backup/recover functionality tests.
- [x] Fix the collectionID order for a P2P test (leaving a note as order
was reverted).
- [x] Update all cids.
- [x] Rename all files with `key(s)|dockey(s)` in the name to
`doc_id(s)`.
- [x] Document breaking change to pass change detector.

## For Reviewers:
- Main commits to review are the `PR(MAIN)` commits. 
- If you have more time `PR(MINOR) and PR(*TEST)` commits are good to go
over


## Disclaimer / Discussion:
I do not like these non-underscored `docID/docIDs`, would be in favor of
underscoring these :

```
query {
    User(docIDs: ["bae-6a6482a8-24e1-5c73-a237-ca569e41507d"]) {
        _docID
    }
}
```

```
query {
    Users {
        Name
        _docID
        _version {
            docID
        }
    }
}
```

```
Request: ` {
    commits(groupBy: [docID], order: {docID: DESC}) {
        docID
    }
}`,
Results: []map[string]any{
    {
        "docID": "bae-f54b9689-e06e-5e3a-89b3-f3aee8e64ca7",
    },
    {
        "docID": "bae-72f3dc53-1846-55d5-915c-28c4e83cc891",
    },
},

```

EDIT: Above was resolved with sourcenetwork#2162, to do out of this PR.


## Limitations (out of scope of this PR):
- sourcenetwork#1467
- sourcenetwork#1751
- sourcenetwork#1550
- sourcenetwork#2162
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/testing Related to any test or testing suite code quality Related to improving code quality deprecate Indicates something is deprecated.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Rename instances of _key to _docID Make dockey and id request arguments more consistent
3 participants