Advanced handling of duplicates in `insert1` #1049

dimitri-yatsenko · 2022-09-01T00:25:53Z

Feature Request

Allow new ways of handling different types of duplicates in insert1

Problem

Currently, there is only one way to skip inserts: skip_duplicates=True ignores all duplicates, whether they collide on the primary key or on a secondary unique index. There are cases, however, when only specific types of duplicates should be skipped.

Requirements

Condition 1. For a primary duplicate, we need an option to ignore the duplicate only if the entry matches on all the secondary unique indexes. This is helpful for tables that map unique indexes between two identification systems.

Condition 2. For a secondary duplicate, it may be helpful to include in the error message the primary key of the duplicate entry already in the database.

Both conditions will require a second query and only apply to insert1 rather than insert.

This could be addressed by allowing other values besides True or False for the skip_duplicates argument in insert1. Considering that both conditions should probably appear together, we can name this option "match":

table.insert1(entry, skip_duplicates='match')
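To make the proposed semantics concrete, here is a minimal in-memory sketch of how 'match' could behave. It is not DataJoint code and does not reflect any existing implementation: `ToyTable`, `DuplicateError`, and the attribute names in the usage lines are hypothetical, and the linear scan over rows stands in for the second query mentioned above.

```python
from typing import Dict, Sequence, Tuple


class DuplicateError(Exception):
    """Raised when an insert collides with an existing row (hypothetical)."""


class ToyTable:
    """Hypothetical in-memory table with a primary key and secondary unique indexes."""

    def __init__(self, primary_key: Sequence[str],
                 unique_indexes: Sequence[Sequence[str]] = ()):
        self.primary_key = tuple(primary_key)
        self.unique_indexes = [tuple(ix) for ix in unique_indexes]
        self.rows: Dict[Tuple, dict] = {}  # primary-key values -> stored row

    @staticmethod
    def _key(row: dict, attrs: Sequence[str]) -> Tuple:
        return tuple(row[a] for a in attrs)

    def insert1(self, row: dict, skip_duplicates=False) -> bool:
        """Insert one row; return True if stored, False if skipped."""
        if skip_duplicates not in (False, True, "match"):
            raise ValueError("skip_duplicates must be False, True, or 'match'")
        pk = self._key(row, self.primary_key)
        existing = self.rows.get(pk)
        if existing is not None:
            # Primary duplicate.
            if skip_duplicates is True:
                return False  # skip unconditionally
            if skip_duplicates == "match" and all(
                self._key(existing, ix) == self._key(row, ix)
                for ix in self.unique_indexes
            ):
                return False  # Condition 1: skip only if all unique indexes match
            raise DuplicateError(f"duplicate primary key {pk}")
        # Look for a secondary duplicate (stands in for the second query).
        for ix in self.unique_indexes:
            for other_pk, other in self.rows.items():
                if self._key(other, ix) == self._key(row, ix):
                    if skip_duplicates is True:
                        return False  # skip unconditionally
                    detail = ""
                    if skip_duplicates == "match":
                        # Condition 2: report the primary key of the existing row.
                        found = dict(zip(self.primary_key, other_pk))
                        detail = f"; existing primary key: {found}"
                    raise DuplicateError(f"duplicate unique index {ix}{detail}")
        self.rows[pk] = dict(row)
        return True


# Usage with hypothetical attribute names: a table mapping two ID systems.
mapping = ToyTable(primary_key=["subject_id"], unique_indexes=[["animal_code"]])
mapping.insert1({"subject_id": 1, "animal_code": "A17"})
# Re-inserting the identical mapping is skipped under 'match'.
mapping.insert1({"subject_id": 1, "animal_code": "A17"}, skip_duplicates="match")
```

In this sketch, the same subject_id with a different animal_code raises an error under 'match', whereas skip_duplicates=True would skip it without complaint.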

Alternative Considerations

This feature was discussed for addressing: datajoint/element-interface#42

Potentially, we could implement this feature in element-interface as a general datajoint utility but not part of datajoint itself. This depends on how clear and common the functionality is.


MadhuMPandurangi · 2022-09-08T06:09:39Z

Hi @dimitri-yatsenko, I'm interested in contributing. I have some experience with Docker, MySQL, and Django.
Could you please assign this issue to me? It would be a great help to me as a first-time contributor.

Thank you

dimitri-yatsenko · 2022-09-12T14:21:59Z

@MadhuMPandurangi Thank you for your interest in contributing. The DataJoint team already has ongoing developments to address this issue and we expect to release a fix shortly.

However, you are welcome to suggest your implementation and issue a PR. The DataJoint team will provide detailed and timely feedback. We will merge the optimal features of both solutions.

guzman-raphael · 2022-09-13T23:27:42Z

Thanks for your interest @MadhuMPandurangi! We always appreciate any help you can provide. 😃

We welcome all PRs, but I might suggest having a look at these good-first-issues. We've recently updated them, and they should reflect some easier ones to get started with.

Please let me know if any of those catch your eye and I can assign them to you.

dimitri-yatsenko added enhancement awaiting-triage labels Sep 1, 2022

dimitri-yatsenko changed the title ~~Advanced duplicate handling~~ Advanced handling of duplicates in insert1 Sep 1, 2022

dimitri-yatsenko self-assigned this Sep 14, 2022

jverswijver removed the awaiting-triage label Oct 10, 2022

CBroz1 mentioned this issue Dec 1, 2022

Add insert1_skip_full_duplicates and remove recursive_search datajoint/element-interface#43

Closed
