Transfer learning on a large dataset #50

szjyoo · 2024-05-27T08:53:01Z

Hello author. I am trying to train CAT-Net on the DocTamper dataset (120,000 images). Should I change self.smallest = 1869 to self.smallest = 120000 in data_core.py, or should I train on a subset of the full dataset in each round? I look forward to your answer.

CauchyComplete · 2024-05-27T11:03:13Z

Hello :)

If you are adding the new DocTamper dataset (120k images) to the existing dataset setup, the smallest dataset is still IMD, so self.smallest should be 1869 (the number of images in IMD).
If you are using only the DocTamper dataset without any other datasets, then setting self.smallest to 120k would be correct in principle. However, every epoch would then use all 120k images, which would take too long. Since the original CAT-Net training uses 1869*10 images per epoch, it might be a good idea to set self.smallest to 1869*10.
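
As a rough illustration of that arithmetic, here is a minimal Python sketch. It assumes that one epoch draws self.smallest images at random from the single dataset; `epoch_indices` is a hypothetical helper written for this example, not a function from data_core.py, and the actual sampling code in CAT-Net may differ.

```python
import random


def epoch_indices(dataset_size, smallest, seed=None):
    """Pick `smallest` distinct image indices for one epoch.

    Hypothetical helper for illustration only; not part of CAT-Net.
    """
    rng = random.Random(seed)
    # Never ask for more images than the dataset contains.
    k = min(smallest, dataset_size)
    return rng.sample(range(dataset_size), k)


doc_tamper_size = 120_000      # DocTamper training images (from this thread)
per_epoch = 1869 * 10          # suggested value for self.smallest

idx = epoch_indices(doc_tamper_size, per_epoch, seed=0)

# Number of epochs needed before the model has seen roughly
# as many images as one full pass over the dataset.
epochs_per_full_pass = doc_tamper_size / per_epoch

print(len(idx), round(epochs_per_full_pass, 2))
```

Under these assumptions each epoch covers 18,690 images, so it takes a little over six epochs to see about as many images as one full pass over DocTamper, while checkpoints and validation happen much more often than with a 120k-image epoch.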

szjyoo · 2024-05-27T11:37:20Z

Thank you very much for your answer. I'm using only DocTamper as a dataset. My validation and test sets contain 10,000 and 30,000 images, respectively. Considering both training efficiency and training performance, I want to know whether setting self.smallest to 10,000 or to 1869*10 will give better results. Looking forward to your answer.

1513691610 · 2024-06-04T09:33:42Z

Hello, I am also training CAT-Net on DocTamper. Could you share your contact information so we can discuss it together? Thank you.

Ridha15 · 2024-07-31T11:14:22Z

Hey. I want to try the same. Can we connect to discuss?
