-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
31 Adding metadata retrieval to get_record()
#37
31 Adding metadata retrieval to get_record()
#37
Conversation
Linked to inbo#31: A small tweek to get `object_id`, `inspection_id` and `insp_order` when using `get_records()`. They are saved alongside data in the output. The `rename_by_id()` has been slightly modified to accept id's not present in inspection_fields.
Linked to inbo#31 All the metadata can be saved by `get_records()`. A get_metadata flag has been added to `get_records()`.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for having a go at this, some notes:
-
get_metadata
is a verb, I don't like using verbs for function arguments but like to reserve them for functions. In this case, I think it's justified, but maybe we should go for:include_metadata
orshow_metadata
instead? -
I know I'm not the best example for this package, but we ought to cover changes with new tests. Let's include some in this PR so we know we aren't breaking anything. I'll help out. It's a bit silly for me to mention this since I haven't done it myself here, but I think this is where we should draw the line. Especially if you we are going to slow down development soon. No worries if this is completely new to you, either just experiment and we'll go trough it together, or ask me for some examples. https://testthat.r-lib.org/
Can you add tests for:
get_record()
:
- The new argument you added
- The new assertation you added
rename_by_id()
:
- new functionality
If you have multiple outcomes, test both, so one for argument TRUE one for argument FALSE. If you expect an error, it's alright to cover the error only, but it's best practise to match it as exactly as possible (and not just any error).
I'll add some more tests, either in this PR or in a separate one. Depending on how quick you are/your preference!
-
Before submitting for review, please run R CMD CHECK locally (check under the build menu in RStudio or via devtools). The CI will catch it, but it's neater if your PR doesn't immediately fail. I've generated the documentation and now it's passing.
-
There are some stylistic things, are you using a linter? Consider lintr! Have a look at style.tidyverse.org, no biggie. Usually in review I'll just fix style issues for the contributor, but in this case I'll leave the choice up to you. Let me know if you'd like me to have a go or would like to have a look at it yourself.
Well done! Really appreciate it. 👍
I tend to agree. We have to keep the user of the package in mind. In this case, you'd be the user so with the optional flag, I think we'd be fine leaving it in the data. However, a solution I see is to return an object that has the metadata as a child or attribute. rgbif works this way for some function returns. If we go this route, I'd love to use this as an excuse to give S7 a go: https://rconsortium.github.io/S7/articles/S7.html Also, it's a better design pattern to use an enum even if there only two choices: https://design.tidyverse.org/boolean-strategies.html, are you open to giving this pattern a try? |
Just realised why we don't have any tests, because of the risk of exposing API keys. The solution to this is vcr as is used in rgbif. https://books.ropensci.org/http-testing/ for more information. With this in mind it might be better to just keep all the testing in a separate PR that we can work together on but I'll be responsible for. |
I will then go for
I had never heard of these, but they are definitely worth the use. Thanks for the tips ! I will use lintr from now on and for this PR.
Yes, I can try it, it should not take too long. The enum to replace the |
What do you think about |
@PietrH, this is indeed a good idea. For the included
|
@Turtle6665 I'm having some trouble visualizing the first option. Could you give a (mock) example? Just a few rows of fake data is fine! |
@PietrH, This would be the first option: |
My vote goes for the third option |
@PietrH, The Removing this row is not the way to go as some observations are in still in double (same data and same metadata). |
How about only removing the columns after you've done the distinct? Are you sure this is an issue? You are more suitable to answer this than me because I don't actually use the iAsset data. But to me it doesn't seem important to get the same number of rows with all metadata options. If you are interested in finding duplicates, you'd want to set metadata appropriately wouldn't you? On second thought, why are we even offering metadata I propose we get rid of the option PS: When you are done working on this, please request my review again and I'll have a more in depth look |
I haven't found the time to look at this yet, but it's on my radar. |
@PietrH can this PR be approved ? |
@SanderDevisscher Can you have a look at the changes starting with 32b7dd0, I trust your judgement. Make sure the CI tests pass. If nothing else we should setup some unit tests for this package. Without them reviewing is becoming more and more difficult. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Works like I would expect it to
Requested changes were implemented
Thanks for the review Sander, ok to merge for me. I'll leave the honors to Turtle6665 |
@Turtle6665 is no longer active at our team so I did the merge |
Fixes #31.
List of changes
get_records()
.get_records()
. Possible values areid
(only includeinsp_order
), andall
(include all available metadata). Default isid
.rename_by_id()
has been modified to accept field IDs not present ininspection_fields
.A possible improvement I currently see
get_record(get_metadata="all")
, it's difficult to tell apart metadata and data columns. Saving metadata in a totally different way could be the way to go, but I don't know how it could be clear and efficient at the same time.