Consider flexibility in rules for matching an article by title #1297
Comments
Not too flexible: #1299 (comment)
FYI, CrossRef etc. use a lot of the same heuristics to match titles (for the purposes of matching a VOR to a preprint)
New case:
There are probably some npm packages for normalizing stuff, but which ones to use and how far-reaching they should be is another question. So far I've avoided all of these.
Background
The article matching has been iterated on many times for different edge cases (#1074, #1124, #848), and there are services aimed at resolving this information, e.g. #1295. From my observations, this works pretty well, but there are cases where no article is matched due to ambiguity.
Currently, an author's input title must be an exact subset of the record retrieved from either PubMed or CrossRef after 'sanitization':
- trimming:
const trimmed = _.trim( raw , ' .')
- lower casing:
const lower = _.toLower( trimmed )
- removal of non-words:
const clean = lower.replace(/[\W_]+/g, ' ')
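The three steps above can be sketched as a single function, followed by the match check. This is a minimal sketch, not the project's actual code: it uses plain JavaScript in place of the lodash calls so it runs without dependencies, the names `sanitize` and `titleMatches` are hypothetical, and "exact subset" is read here as a substring test, which is an assumption.

```javascript
// Hypothetical helper: applies the three sanitization steps in order.
function sanitize(raw) {
  // Stand-in for _.trim(raw, ' .'): strip leading/trailing spaces and periods
  const trimmed = raw.replace(/^[ .]+|[ .]+$/g, '');
  // Stand-in for _.toLower(trimmed)
  const lower = trimmed.toLowerCase();
  // Collapse each run of non-word characters (and underscores) to one space.
  // Without the `u` flag, \W also matches non-ASCII letters and dashes.
  return lower.replace(/[\W_]+/g, ' ');
}

// Assumption: "exact subset" means the sanitized input title must appear
// as a contiguous substring of the sanitized record title.
function titleMatches(inputTitle, recordTitle) {
  return sanitize(recordTitle).includes(sanitize(inputTitle));
}

console.log(sanitize('Blood-brain barrier.'));
console.log(titleMatches('blood-brain barrier', 'The Blood-Brain Barrier: a review.'));
```

Under this reading, punctuation and case differences are absorbed by the sanitization, but any extra words in the author's input that are absent from the record (for instance, a journal name typed before the title) prevent a match.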
Problems observed
There remain cases where we might reasonably want to relax these conditions. For example:
eLife 2024: Defining cell type-specific immune responses in a mouse model of allergic contact dermatitis by single-cell transcriptomics
Neurons enhance blood-–brain barrier function via upregulating claudin-5 and VE-cadherin expression due to glial cell line-derived neurotrophic factor secretion
Details
There are potential pitfalls to increasing flexibility. Notably, the title of a manuscript can change between preprints, revised versions, and the final version of record.
Tasks