✨ Enable to easily join features onto artifacts via Artifact.df() #2238

Merged: falexwolf merged 10 commits into main from annotatedf on Nov 30, 2024

falexwolf (Member) commented Nov 29, 2024

Summary

You can now easily join features onto the Artifact registry.

image

We find the features from the schema definition, independent of whether they are used as observation-level or dataset-level metadata.

import lamindb as ln

# Register the features that can then be joined onto the Artifact registry.
ln.Feature(name="cell_medium", dtype="cat[ULabel]").save()
ln.Feature(name="cell_type_by_expert", dtype="cat[bionty.CellType]").save()
ln.Feature(name="cell_type_by_model", dtype="cat[bionty.CellType]").save()
ln.Feature(name="temperature", dtype="float").save()
ln.Feature(name="study", dtype="cat[ULabel]").save()
ln.Feature(name="date_of_study", dtype="date").save()
ln.Feature(name="study_note", dtype="str").save()
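
To make the idea of joining features onto a registry concrete, here is a minimal sketch in plain Python that does not use lamindb. It assumes annotations are available as (artifact key, feature name, value) records, which is a hypothetical layout chosen for illustration, and pivots them into one row per artifact with one column per feature. The helper `join_features`, the artifact keys, and the annotation values are all made up; this is not how lamindb implements the join.

```python
from collections import defaultdict


def join_features(artifact_keys, annotations, feature_names):
    """Pivot long-form (artifact_key, feature, value) records into wide rows.

    Hypothetical helper for illustration only; not lamindb's implementation.
    """
    # Collect all values per artifact and feature; a feature may have several.
    values = defaultdict(lambda: defaultdict(set))
    for artifact_key, feature, value in annotations:
        values[artifact_key][feature].add(value)

    rows = []
    for key in artifact_keys:
        row = {"key": key}
        for name in feature_names:
            # Artifacts lacking an annotation get None, so no row is dropped.
            row[name] = values[key].get(name) or None
        rows.append(row)
    return rows


# Made-up annotations for two hypothetical artifacts.
annotations = [
    ("dataset1.parquet", "temperature", 21.6),
    ("dataset1.parquet", "cell_type_by_expert", "B cell"),
    ("dataset1.parquet", "cell_type_by_expert", "T cell"),
    ("dataset2.parquet", "study_note", "pilot run"),
]
rows = join_features(
    ["dataset1.parquet", "dataset2.parquet"],
    annotations,
    ["temperature", "cell_type_by_expert", "study_note"],
)
for row in rows:
    print(row)
```

Every artifact keeps its row even when it has no value for a given feature, which gives the wide table the same shape as a registry with extra feature columns.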

Here is another screenshot, scrolling the table further to the right.

image

More changes.

  • QuerySet.df(include=[...]) now relies on .annotate() instead of pandas for simple joins on Django fields. This always performs an "outer join", so the join argument was removed.
  • ln.view(df) displays a DataFrame with types in the columns
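
What "always an outer join" means in practice can be illustrated with a small, self-contained sketch that uses Python's built-in sqlite3 module rather than Django or lamindb. The table and column names below are hypothetical. With a LEFT OUTER JOIN, a row whose related record is missing is kept with a NULL value, whereas an INNER JOIN drops it.

```python
import sqlite3

# In-memory database with two hypothetical tables.
con = sqlite3.connect(":memory:")
con.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE artifacts (
        id INTEGER PRIMARY KEY,
        key TEXT,
        created_by_id INTEGER REFERENCES users(id)
    );
    INSERT INTO users VALUES (1, 'alice');
    INSERT INTO artifacts VALUES (1, 'a.parquet', 1), (2, 'b.parquet', NULL);
    """
)

query = """
    SELECT a.key, u.name
    FROM artifacts AS a
    {join} JOIN users AS u ON a.created_by_id = u.id
    ORDER BY a.id
"""

# The outer join keeps the artifact without a related user (name is None).
outer = con.execute(query.format(join="LEFT OUTER")).fetchall()
# The inner join silently drops that artifact.
inner = con.execute(query.format(join="INNER")).fetchall()

print(outer)
print(inner)
con.close()
```

Because the outer join never drops rows from the queried registry, a separate option to pick the join type is no longer needed for this kind of lookup.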

Docs changes

Overhauled the registries guide.

| Before | After |
|---|---|
| image | Better section names: image |
| The first impression was of generating "toy data": image | The first impression is now a "Get an overview" section: image |
| / | The guide now uses small_dataset1 and small_dataset2, as do many tests and the quickstart: image image |
| / | The guide documents how to include fields from other registries; these start at position 3 in the columns: image |
| / | You can now easily join features onto the Artifact registry: image image |
| image | image |
| image | image |
| image | image |

Materials

Needs:

falexwolf changed the title from "Annotatedf" to "✨ Enable to visualize features in Artifact.df()" on Nov 29, 2024

codecov bot commented Nov 29, 2024

Codecov Report

Attention: Patch coverage is 95.21277% with 9 lines in your changes missing coverage. Please review.

Project coverage is 92.94%. Comparing base (c54f99f) to head (66045c9).
Report is 17 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| lamindb/_query_set.py | 95.58% | 6 Missing ⚠️ |
| lamindb/_view.py | 95.45% | 2 Missing ⚠️ |
| lamindb/_feature.py | 50.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2238      +/-   ##
==========================================
+ Coverage   92.36%   92.94%   +0.57%     
==========================================
  Files          54       54              
  Lines        6566     6818     +252     
==========================================
+ Hits         6065     6337     +272     
+ Misses        501      481      -20     


github-actions bot commented Nov 29, 2024

github-actions bot temporarily deployed to pull request on November 29, 2024 23:17 (Inactive)
falexwolf changed the title from "✨ Enable to visualize features in Artifact.df()" to "✨ Enable to include features in Artifact.df()" on Nov 30, 2024
github-actions bot temporarily deployed to pull request on November 30, 2024 02:28 (Inactive)
falexwolf merged commit 735ef6c into main on Nov 30, 2024
16 checks passed
falexwolf deleted the annotatedf branch on November 30, 2024 02:30
falexwolf changed the title from "✨ Enable to include features in Artifact.df()" to "✨ Enable to easily join features onto artifacts via Artifact.df()" on Dec 1, 2024