Advanced handling of duplicates in insert1 #1049

Open
dimitri-yatsenko opened this issue Sep 1, 2022 · 3 comments

@dimitri-yatsenko
Member

dimitri-yatsenko commented Sep 1, 2022

Feature Request

Allow new ways of handling different types of duplicates in insert1

Problem

Currently, there is only one way to skip duplicate inserts: skip_duplicates=True ignores all duplicates, whether they collide on the primary key or on a secondary unique index. There are cases, however, when only specific types of duplicates should be skipped.
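To make the problem concrete, here is a minimal sketch that uses SQLite's `INSERT OR IGNORE` as a stand-in for an ignore-all-duplicates insert. It is an analogy only: DataJoint runs on MySQL, and the `id_map` table and its columns are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical mapping table: subject_id is the primary key,
# external_id carries a secondary unique index.
conn.execute(
    "CREATE TABLE id_map ("
    "subject_id INTEGER PRIMARY KEY, "
    "external_id TEXT NOT NULL UNIQUE)"
)
conn.execute("INSERT INTO id_map VALUES (1, 'A')")

# Exact re-insert: harmless to skip.
conn.execute("INSERT OR IGNORE INTO id_map VALUES (1, 'A')")
# Same primary key, different external_id: a conflicting mapping,
# yet it is skipped just as silently.
conn.execute("INSERT OR IGNORE INTO id_map VALUES (1, 'B')")
# New primary key, existing external_id: a secondary duplicate,
# also skipped without any message.
conn.execute("INSERT OR IGNORE INTO id_map VALUES (2, 'A')")

rows = conn.execute(
    "SELECT subject_id, external_id FROM id_map ORDER BY subject_id"
).fetchall()
print(rows)  # only the original row (1, 'A') remains
```

All three inserts are treated the same way, so the caller cannot tell a harmless repeat from a conflicting entry.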

Requirements

Condition 1. For a primary key duplicate, we need an option to ignore the duplicate only if the new entry also matches the existing entry on all the secondary unique indexes. This is helpful for tables that map unique indexes between two identification systems.

Condition 2. For a secondary duplicate, it may be helpful to include in the error message the primary key of the duplicate entry already in the database.

Both conditions require a second query and apply only to insert1, not to insert.

This could be addressed by allowing other values besides True or False for the skip_duplicates argument in insert1. Considering that both conditions should probably appear together, we can name this option "match":

table.insert1(entry, skip_duplicates='match')
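One way the 'match' behavior could work is sketched below under stated assumptions. It is not DataJoint's implementation: it uses SQLite from the standard library instead of MySQL, and `insert1_match`, `DuplicateMismatch`, `id_map`, `PRIMARY_KEY`, and `UNIQUE_ATTRS` are hypothetical names. The sketch attempts the insert first and runs the second query only when the insert fails.

```python
import sqlite3

# Hypothetical schema description for the table used in this sketch.
PRIMARY_KEY = ("subject_id",)
UNIQUE_ATTRS = ("external_id",)


class DuplicateMismatch(Exception):
    """Raised when a duplicate does not match the stored entry."""


def insert1_match(conn, entry):
    """Insert `entry` into id_map.

    Returns True if the row was inserted and False if it was skipped
    because an identical mapping already exists (Condition 1).
    """
    cols = list(entry)
    sql = "INSERT INTO id_map ({}) VALUES ({})".format(
        ", ".join(cols), ", ".join("?" for _ in cols)
    )
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, [entry[c] for c in cols])
        return True
    except sqlite3.IntegrityError as err:
        original = err

    # Second query: is there a row with the same primary key?
    where = " AND ".join(f"{k} = ?" for k in PRIMARY_KEY)
    existing = conn.execute(
        f"SELECT {', '.join(UNIQUE_ATTRS)} FROM id_map WHERE {where}",
        [entry[k] for k in PRIMARY_KEY],
    ).fetchone()
    if existing is not None:
        if tuple(existing) == tuple(entry[a] for a in UNIQUE_ATTRS):
            return False  # Condition 1: full match, skip silently
        raise DuplicateMismatch(
            "primary key exists but unique attributes differ"
        ) from original

    # Condition 2: secondary duplicate, report the stored primary key.
    where = " OR ".join(f"{a} = ?" for a in UNIQUE_ATTRS)
    row = conn.execute(
        f"SELECT {', '.join(PRIMARY_KEY)} FROM id_map WHERE {where}",
        [entry[a] for a in UNIQUE_ATTRS],
    ).fetchone()
    if row is None:
        raise original  # some other integrity error, not a duplicate
    raise DuplicateMismatch(
        f"unique index collides with existing entry {dict(zip(PRIMARY_KEY, row))}"
    ) from original


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE id_map ("
    "subject_id INTEGER PRIMARY KEY, "
    "external_id TEXT NOT NULL UNIQUE)"
)
insert1_match(conn, {"subject_id": 1, "external_id": "A"})  # inserted
insert1_match(conn, {"subject_id": 1, "external_id": "A"})  # skipped
```

In this sketch, re-inserting the same mapping is skipped, while `{"subject_id": 1, "external_id": "B"}` and `{"subject_id": 2, "external_id": "A"}` both raise `DuplicateMismatch`; the latter message names the primary key of the row already stored. Because the lookup happens only after a failed insert, the common non-duplicate path still costs a single query.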

Alternative Considerations

This feature was discussed for addressing: datajoint/element-interface#42

Potentially, we could implement this feature in element-interface as a general datajoint utility but not part of datajoint itself. This depends on how clear and common the functionality is.

dimitri-yatsenko changed the title from "Advanced duplicate handling" to "Advanced handling of duplicates in insert1" on Sep 1, 2022
@MadhuMPandurangi

MadhuMPandurangi commented Sep 8, 2022

Hi @dimitri-yatsenko, I'm interested in contributing. I have some experience with Docker, MySQL, and Django.
Could you please assign this issue to me? It would be a great help to me as a first-time contributor.

Thank you

@dimitri-yatsenko
Member Author

@MadhuMPandurangi Thank you for your interest in contributing. The DataJoint team already has ongoing developments to address this issue and we expect to release a fix shortly.

However, you are welcome to suggest your implementation and issue a PR. The DataJoint team will provide detailed and timely feedback. We will merge the optimal features of both solutions.

@guzman-raphael
Collaborator

Thanks for your interest @MadhuMPandurangi! We always appreciate any help you can provide. 😃

We welcome all PRs, but I might suggest having a look at these good-first-issues. We've recently updated them, and they should reflect some easier ones to get started with.

Please let me know if any of those catch your eye and I can assign them to you.
