We have detected Bibliographic Resources in OpenCitations Meta that are each linked to multiple external IDs which, in the real world, identify distinct publications that appear periodically in the same venue (journal), e.g. editorial comments or recurrent news columns in a specific research field.
The periodicity of these publications might not be relevant to the cause of the problem (i.e. IDs of separate resources all being linked to a single resource in Meta). Still, it seems worth pointing out, because these cases do not appear to be generated, or at least not exclusively, by software bugs in Meta (unlike the cases where different real-world entities with no perceivable common features have been erroneously merged). Rather, they seem to result from errors in the data provided by OpenCitations' primary sources (e.g. Crossref, DataCite, PubMed, OpenAIRE, etc.).
For example, let's consider the case of br/061903839782. By querying the Meta SPARQL endpoint, we can see that this journal article has 76 DOIs and 17 PMIDs.
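As a rough illustration, the identifiers attached to a resource could be retrieved and counted along these lines. This is a minimal sketch, not the exact query we ran: the endpoint URL, the `https://w3id.org/oc/meta/` URI prefix, and the use of `datacite:hasIdentifier`, `datacite:usesIdentifierScheme` and `literal:hasLiteralValue` are assumptions based on the OpenCitations Data Model, and the function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from collections import Counter

# Assumption: public Meta SPARQL endpoint; adjust if it has moved.
META_SPARQL_ENDPOINT = "https://opencitations.net/meta/sparql"


def build_id_query(br_id: str) -> str:
    """Build a SPARQL query listing (scheme, value) for every ID of a resource."""
    # Assumption: identifiers are modelled as in the OpenCitations Data Model.
    return f"""
PREFIX datacite: <http://purl.org/spar/datacite/>
PREFIX literal: <http://www.essepuntato.it/2010/06/literalreification/>
SELECT ?scheme ?value WHERE {{
  <https://w3id.org/oc/meta/{br_id}> datacite:hasIdentifier ?id .
  ?id datacite:usesIdentifierScheme ?scheme ;
      literal:hasLiteralValue ?value .
}}"""


def count_ids_by_scheme(results: dict) -> Counter:
    """Count identifiers per scheme in a SPARQL JSON result set."""
    counts = Counter()
    for row in results["results"]["bindings"]:
        # Scheme URIs end in the scheme name, e.g. ".../datacite/doi".
        scheme = row["scheme"]["value"].rsplit("/", 1)[-1]
        counts[scheme] += 1
    return counts


def fetch_ids(br_id: str) -> dict:
    """Send the query to the endpoint and return the parsed JSON results."""
    data = urllib.parse.urlencode({"query": build_id_query(br_id)}).encode()
    request = urllib.request.Request(
        META_SPARQL_ENDPOINT,
        data=data,
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

Calling `count_ids_by_scheme(fetch_ids("br/061903839782"))` would then give the number of DOIs and PMIDs currently attached to the resource, which may differ from the figures reported above if Meta has since been updated.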
We then looked up these external IDs in the databases of the primary sources by querying the appropriate APIs, to see how they are currently represented (bearing in mind that the data exposed via API today might differ from what it was when the entity was ingested into Meta). For the sake of brevity, we only post the script we used to obtain current data on these external IDs from PubMed, since PubMed is the source that generates the error in this case.
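For readers who want to reproduce the lookup, a function of this kind could be sketched as follows. Only the name `get_pmids_for_doi()` comes from this issue; the signature, the helper `esearch_pmids()`, the use of the NCBI E-utilities ESearch endpoint, and the `[AID]` field tag for matching DOIs are assumptions, so this should be read as an approximation rather than the script we actually ran.

```python
import json
import time
import urllib.parse
import urllib.request
from typing import Callable, Dict, Iterable, List

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_pmids(doi: str) -> List[str]:
    """Return the PMIDs that PubMed currently reports for one DOI."""
    # Assumption: the DOI is matched through the Article Identifier field.
    params = urllib.parse.urlencode(
        {"db": "pubmed", "term": f"{doi}[AID]", "retmode": "json"}
    )
    with urllib.request.urlopen(f"{ESEARCH_URL}?{params}", timeout=30) as resp:
        payload = json.load(resp)
    return payload["esearchresult"].get("idlist", [])


def get_pmids_for_doi(
    dois: Iterable[str],
    search: Callable[[str], List[str]] = esearch_pmids,
    pause: float = 0.34,
) -> Dict[str, List[str]]:
    """Map each DOI to the list of PMIDs returned by `search`."""
    mapping: Dict[str, List[str]] = {}
    for doi in dois:
        mapping[doi] = search(doi)
        # Stay under the limit of 3 requests per second without an API key.
        time.sleep(pause)
    return mapping
```

The `search` parameter is injectable so that the mapping logic can be exercised without network access, for instance by passing a function that reads from a local dump.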
By running the get_pmids_for_doi() function on the list of DOIs associated with br/061903839782 in Meta, we obtain a dictionary holding the DOI-to-PMID mapping available in the current PubMed data. These results show that the great majority of the DOIs point to multiple PMIDs in PubMed, which explains why so many IDs point to the same resource in Meta. Nonetheless, 21 of the queried DOIs are each associated with a single PMID in the current state of PubMed. The likely reason why they, too, point to br/061903839782 is that the data in PubMed was updated after this entity (and its external IDs) was ingested into Meta.
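Given such a dictionary, separating the DOIs that resolve to a single PMID from those that resolve to several is straightforward. The helper below is a hypothetical sketch, assuming the mapping has the shape `{doi: [pmid, ...]}`; the sample data is invented for illustration and is not taken from PubMed.

```python
from typing import Dict, List, Tuple


def split_by_pmid_count(
    mapping: Dict[str, List[str]],
) -> Tuple[Dict[str, str], Dict[str, List[str]], List[str]]:
    """Split a DOI-to-PMIDs mapping into unique, ambiguous and unmatched DOIs."""
    # DOIs that resolve to exactly one PMID.
    unique = {doi: pmids[0] for doi, pmids in mapping.items() if len(pmids) == 1}
    # DOIs that resolve to more than one PMID.
    ambiguous = {doi: pmids for doi, pmids in mapping.items() if len(pmids) > 1}
    # DOIs for which no PMID was returned.
    missing = [doi for doi, pmids in mapping.items() if not pmids]
    return unique, ambiguous, missing


# Invented sample data, for illustration only.
sample = {
    "10.0000/example.1": ["111", "222", "333"],
    "10.0000/example.2": ["444"],
    "10.0000/example.3": [],
}
unique, ambiguous, missing = split_by_pmid_count(sample)
print(len(unique), len(ambiguous), len(missing))
```

The size of `ambiguous` relative to `unique` gives a quick measure of how much of the merge can be traced back to the source data.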
For more information on the issue described here and on the operations carried out to examine it, see the following gist: https://gist.github.com/eliarizzetto/c984bb85642aee7ae9eeb0761a9f0d40.