possible public data sets? #6

Open
topepo opened this issue Aug 31, 2022 · 2 comments

Comments

@topepo
Collaborator

topepo commented Aug 31, 2022

We should probably try to get some real raw data. https://data.mendeley.com is fairly good (if you have low expectations).

Searching for "hplc spectra" within datasets yields this, which seems helpful.

Anyway, we can add sources to this issue thread.

@JamesHWade
Owner

I added a list of data sources here and copied it below. I have some real data that isn't approved for external release, and I may simulate some data to match that structure.

There are a number of data sets in the modeldata and prospectr packages. These include modeldata::meats and prospectr::NIRsoil.

Mendeley Data offers data included as part of various publications. A few possibilities include:

Machine Learning of MS dataset
Data for: A Sensitive Quantitative Analysis of Abiotically Synthesized Short Homopeptides using Ultraperformance Liquid Chromatography and Time-of-Flight Mass Spectrometry

There are also some repositories that might be of use:

Crystallography Open Database
NMRShiftDB
Spectral Database for Organic Compounds, SDBS
NIST Chemistry WebBook

@topepo
Collaborator Author

topepo commented Sep 8, 2022

Here's a fairly large data set, described at https://chemom2019.sciencesconf.org/resource/page/id/13.html, with the source data linked at the bottom of that page. We don't know the actual wavelengths or the test set outcome data, but it looks like a nice data set.
