Hello, I've recently been attempting to retrain MolScribe using the scripts you provide in the scripts directory. First of all, thanks for providing all your data and training scripts; they are extremely helpful.
Second, when running train_uspto_joint_chartok.sh I get a series of missing image warnings like:
I downloaded the ZIP from the link provided in the README: https://www.dropbox.com/s/3podz99nuwagudy/uspto_mol.zip?dl=0, and unzipped it into data. This is not an issue when running train_uspto_joint_chartok_1m680k.sh. The problem is that uspto_mol/train_200k.csv contains paths to images that are not included in the ZIP archive.
It would be good to be able to run the smaller training set for quicker comparisons with your saved checkpoint. Let me know if this is fixable. Thanks for your time and for this model!
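For anyone hitting the same warnings, here is a rough sketch of how to confirm which images a training CSV references but the unzipped archive does not contain. It assumes the CSV stores image paths relative to the data directory in a column named `file_path`; that column name and the helper `find_missing_images` are my assumptions, not something taken from the repository, so adjust them to match the actual CSV header.

```python
import csv
from pathlib import Path


def find_missing_images(csv_path, data_root, path_column="file_path"):
    """Return the image paths listed in csv_path that do not exist under data_root.

    path_column is an assumed column name; change it to match the real CSV header.
    """
    data_root = Path(data_root)
    missing = []
    with open(csv_path, newline="") as f:
        reader = csv.DictReader(f)
        if path_column not in (reader.fieldnames or []):
            raise KeyError(f"column {path_column!r} not found in {csv_path}")
        for row in reader:
            rel_path = row[path_column]
            # Record every referenced image that is not present on disk.
            if not (data_root / rel_path).is_file():
                missing.append(rel_path)
    return missing
```

Calling `find_missing_images("data/uspto_mol/train_200k.csv", "data")` from the repository root should then list the referenced images that are absent, which makes it easy to tell whether the whole 200k subset is missing or only part of it.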
Sorry for the late reply. In our paper, we only kept the model trained with 1M synthetic data and 680K patent data. Therefore, only these data are released, and we encourage using them for future comparisons.
If you still want that 200K data, please send me an email at [email protected]. I can give you a link for private download.
Thanks for the explanation. The 200k data isn't necessary for me. However, it may be good to add a note to the 200k training script or the README saying that this data is not released.