Model training for StressForceOutput #92

Open
wkylee14 opened this issue Jun 27, 2024 · 4 comments

@wkylee14

Hi,

Thanks for making the Allegro repo public. I was wondering whether you have any guidance or thoughts on preparing a config file for training an Allegro model that predicts stress tensor outputs in addition to forces and total potential energies (StressForceOutput).

I've made several attempts on my dataset with the configurations below, but the stress loss decreases only very slowly and marginally, while the force and energy losses keep decreasing after a certain number of training epochs.

Attempt 1: Applying PerAtomMSELoss
loss_coeffs:
  forces: 1.
  total_energy:
    - 1.
    - PerAtomMSELoss
  stress:
    - 1.
    - PerAtomMSELoss

Attempt 2: Assigning a higher weight to the stress loss
loss_coeffs:
  forces: 1.
  total_energy:
    - 1.
    - PerAtomMSELoss
  stress: 100.

Attempt 3: Using a plain MSE loss for the stress tensor
loss_coeffs:
  forces: 1.
  total_energy:
    - 1.
    - PerAtomMSELoss
  stress: 1.

Otherwise, would you recommend not adding a loss term for the stress tensor at all?
Any recommendations or guidance for using Allegro to predict stress tensors, forces, and potential energies would be welcome!

Kind regards,

@Linux-cpp-lisp
Collaborator

Hi @wkylee14,

Thanks for your interest in our code!

stress should use a normal loss, not a PerAtom one. Stress training issues are often linked to incorrect labels, whether due to DFT issues, unit conversion issues, or an incorrect sign convention. (We follow the convention stress = (-1 / volume) * virial, as discussed in various other threads on the nequip repo: https://github.com/mir-group/nequip/blob/main/nequip/nn/_grad_output.py#L346-L349).
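
For reference, a minimal loss_coeffs sketch along these lines (assuming the rest of the config is unchanged and the dataset provides stress labels) could look like:

loss_coeffs:
  forces: 1.
  total_energy:
    - 1.
    - PerAtomMSELoss
  # plain (non-PerAtom) MSE for the stress term, per the note above
  stress: 1.

The stress weight itself can still be tuned (as in Attempt 2 above) once the labels are confirmed to follow the expected sign convention and units.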

@biglinn

biglinn commented Nov 27, 2024

Hi, @Linux-cpp-lisp

I have an extxyz file like this:
Lattice="7.749908999 0.0 0.0 3.874954499 6.71161807 0.0 3.874954499 2.237206023 6.3277742" Properties=species:S:1:pos:R:3:forces:R:3 energy=-85.53947668 stress="0.013635762572225896 -0.001867459530464271 -0.00034451257922137816 -0.001867459530464271 0.00040893119490961933 0.003952978665259525 -0.00034451257922137816 0.003952978665259525 -0.0004071773308452461" free_energy=-85.53947668 pbc="T T T"
And I set the loss function as follows:

loss_coeffs:
  forces: 1.
  stress: 1.
  total_energy:
    - 1.
    - PerAtomMSELoss

Is this correct?
I find that virial is used instead of stress in https://github.com/mir-group/nequip/blob/main/configs/minimal_stress.yaml#L56C1-L58C10.
Since stress = (-1 / volume) * virial, is there a trick for the unit conversion?
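
For concreteness, here is how I currently understand the two options, assuming (my reading, not confirmed above) that stress labels in ASE units (eV/Å^3) can be trained on directly, while a virial label would be related to them by virial = -volume * stress:

loss_coeffs:
  forces: 1.
  # when the extxyz file provides stress labels (assumed eV/Å^3, ASE convention)
  stress: 1.
  # or, when the dataset provides virials instead (virial = -volume * stress):
  # virial: 1.
  total_energy:
    - 1.
    - PerAtomMSELoss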

@biglinn

biglinn commented Nov 27, 2024

After setting the loss function as above, the losses are as follows:

  Train      #    Epoch      wal       LR       loss_f  loss_stress       loss_e         loss        f_mae       f_rmse   stress_mae  stress_rmse        e_mae       e_rmse
! Train             359 2939.468 7.81e-06     0.000128     4.47e-06     3.61e-06     0.000136      0.00832       0.0113      0.00162      0.00212       0.0254       0.0304
! Validation        359 2939.468 7.81e-06        0.122     2.53e-05         4.32         4.44        0.273        0.349      0.00396      0.00503         33.2         33.2

The energy loss on the validation dataset is very large, which is strange.

@biglinn

biglinn commented Nov 27, 2024

Hi, @wkylee14

Could you please share your configuration file?
Do you use an extxyz file?
