
SHARC with pysharc-directory #115

Open
jakobcasa opened this issue May 7, 2024 · 20 comments

@jakobcasa commented May 7, 2024

Dear all

I have realized that all the jobs I intended to run with the normal SHARC module were, by accident, run with pysharc. To be clear: I set one path variable for pysharc pointing to the pysharc bin and one for sharc pointing to the sharc bin. By accident, I used the pysharc path, even though I did all the preparation according to SHARC and had already run the trajectories up to some point with SHARC. Initially, I had tried to run LVC SHARC with the same script, but I have not changed it until now. Is this a problem, or does it only depend on the option I chose when setting up the trajectories in the first place?

I hope it is the latter.

If it helps to look at one of the trajectory folders, let me know.

Thank you

Best
Jakob

@maisebastian (Collaborator)

Dear Jakob,
I cannot fully follow your explanations. In principle, the input files for sharc.x and for pysharc are nearly identical. However, depending on whether you compiled with or without PYSHARC (see Makefile), some options will or will not be available (with PYSHARC, many of the newer features in SHARC 3 from the Minnesota group are missing, without PYSHARC the NetCDF output writing might not work, etc).
If you do not rely on these options, then your jobs might be fine. But note that running LVC without pysharc is really a waste of resources.

Best,
Sebastian

@jakobcasa (Author)

Dear Sebastian

I will try to make it clearer. I did all the preparation according to the normal SHARC approach and started the trajectories normally (I wanted the normal SHARC approach for molecule A). At some point during the MD, I tried to set up LVC (to run a different molecule, molecule B, with LVC). At that time, I changed $SHARC to $pysharc (which points to the pysharc bin) to test whether pysharc works, but I forgot to change it back. So, for example, when I resubmitted molecule A, I ran it with pysharc and not SHARC. I was wondering if this is an issue, since during the MD there was no error. Also, in the preparation I set SHARC to use ORCA and NOT the LVC Hamiltonian.

Also, what irritates me is that when I set the path back to $SHARC, it does not work anymore.

If you need a trajectory folder to check, I would be happy to send you one, or you could tell me what to look for.

I hope it is a bit clearer now.

Thank you

Best
Jakob

@maisebastian (Collaborator)

Dear Jakob,
if you only changed $SHARC from one installation to another one, this should not lead to too many problems, at least if you do standard ab initio trajectories using sharc.x. In both installations (with and without PYSHARC) there will be a sharc.x executable that does the same dynamics, with the exceptions noted above. Both installations will also have SHARC_ORCA.py, so that should also work. Of course one should check a trajectory for consistency.

I do not really see a reason why the trajectories should not work anymore when switching back to the original $SHARC path. I would need to look at such a trajectory.

Best,
Sebastian

@jakobcasa (Author)

Dear Sebastian

I checked, and pysharc/bin does not contain sharc.x; instead, it contains only pysharc_qmout.py and pysharc_lvc.py. Maybe I did not describe this correctly, sorry. The sharc folder has its own sharc/bin ($SHARC), and the pysharc folder contains a separate bin folder ($pysharc) with pysharc_qmout.py and pysharc_lvc.py in it.

Sorry for the confusion. How can I check the respective trajectory for consistency? Also, is there a way to check which PES was actually calculated (e.g., LVC or not)? Concerning the time needed to calculate the next step: the time only changed (roughly x2) when I reduced the number of CPUs from 16 to 8, which makes sense. But as far as I understand from you, there is also the possibility that it calculated the LVC trajectory with the normal SHARC approach (which is logically a waste of time). How can I check whether this happened in my case?

I can send you a link by email to the trajectory to check.

Thank you

Best
Jakob

@maisebastian (Collaborator)

Hi Jakob,
now I better understand. However, as a user, you should actually never need to use the folder /pysharc/bin. If you run "make install" properly (see Manual), then the files in /pysharc/bin should be copied to /bin/, i.e., to $SHARC. So you can run normal sharc.x as well as pysharc with the same definition of $SHARC.

That said, if your sharc.x/ORCA trajectories kept on going when you changed $SHARC, then this means that the trajectories did not even experience this change of the variable. Note that a currently running executable and its subprocesses will not be affected if you change an environment variable. In SHARC, the calculation will only notice this when you restart, because this will create a new instance of the executable and new subprocesses.
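As a minimal illustration (a sketch with hypothetical paths, not SHARC code), a child process only sees the environment it inherited at launch:

    import os
    import subprocess

    # Start a child with a snapshot of the current environment (hypothetical paths).
    child = subprocess.Popen(
        ["python3", "-c", "import os, time; time.sleep(2); print(os.environ.get('SHARC'))"],
        env={**os.environ, "SHARC": "/path/to/old/sharc/bin"},
    )
    os.environ["SHARC"] = "/path/to/new/pysharc/bin"  # changed after launch
    child.wait()  # the child still prints /path/to/old/sharc/bin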

So if your sharc.x/ORCA trajectory kept on going when you changed $SHARC, it likely did not affect anything and the trajectories should be fine. If you want, I can still quickly check a trajectory if you send me a link.

Best,
Sebastian

@jakobcasa (Author)

Dear Sebastian

I'm sorry for not getting back to you sooner. I did not check the mails yesterday evening.

Thank you for the offer to check the traj quickly. I will send you a link to the files shortly.

If I understand you correctly, it should not be an issue, since at every step a new SHARC call is made, and since the program did not complain (which it did not), it should be good. I will change the path back to $SHARC, i.e., pointing to sharc.x, as soon as the calculation is done.

In the meantime, I have experienced a second issue, since the cluster I work on is being updated from CentOS 7 to Rocky Linux 9. When I log in to the new system, "all of a sudden" SHARC does not find the ORCA.engrad.ground.grad.tmp file:
Traceback (most recent call last):
File "/storage/homefs/jc22p286/applications/sharc/bin/SHARC_ORCA.py", line 5278, in <module>
main()
File "/storage/homefs/jc22p286/applications/sharc/bin/SHARC_ORCA.py", line 5251, in main
QMin, QMout = getQMout(QMin)
File "/storage/homefs/jc22p286/applications/sharc/bin/SHARC_ORCA.py", line 4531, in getQMout
g = getgrad(logfile, QMin)
File "/storage/homefs/jc22p286/applications/sharc/bin/SHARC_ORCA.py", line 4979, in getgrad
Gfile = open(logfile, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: '/storage/homefs/jc22p286/m2/sharc/MeCN/trj/batch4/SCR/Singlet_4/TRAJ_00109/master_1/ORCA.engrad.ground.grad.tmp'

I don't quite understand what has changed because I have changed nothing (e.g., the ORCA version and the loaded modules are the same). A standard ORCA calculation works fine on the new system.

Thank you once again for checking the traj.

Best
Jakob

@maisebastian (Collaborator)

Hi Jakob,
you would need to check your scratch directory
/storage/homefs/jc22p286/m2/sharc/MeCN/trj/batch4/SCR/Singlet_4/TRAJ_00109/master_1/
to see what happened to the ORCA calculation.

Best,
Sebastian

@jakobcasa (Author)

Dear Sebastian

Thank you for checking the TRAJ. I assume that the others are fine, too.

I checked the ORCA.log; the job crashed in the "Exchange-Correlation gradient ..." step, but unfortunately I do not know what to look for beyond that. I uploaded the TRAJ to the same folder as before, in case you want to have a look.

But now the following error turned up:
Traceback (most recent call last):
File "/storage/homefs/jc22p286/applications/sharc/bin/SHARC_ORCA.py", line 5278, in
main()
File "/storage/homefs/jc22p286/applications/sharc/bin/SHARC_ORCA.py", line 5242, in main
errorcodes = runjobs(schedule, QMin)
File "/storage/homefs/jc22p286/applications/sharc/bin/SHARC_ORCA.py", line 2888, in runjobs
saveFiles(WORKDIR, jobset[job])
File "/storage/homefs/jc22p286/applications/sharc/bin/SHARC_ORCA.py", line 3398, in saveFiles
saveMolden(WORKDIR, QMin)
File "/storage/homefs/jc22p286/applications/sharc/bin/SHARC_ORCA.py", line 3453, in saveMolden
shutil.copy(fromfile, tofile)
File "/software.el7/software/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/shutil.py", line 417, in copy
copyfile(src, dst, follow_symlinks=follow_symlinks)
File "/software.el7/software/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/shutil.py", line 254, in copyfile
with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/homefs/jc22p286/m2/sharc/MeCN/trj/batch4/SCR/Singlet_4/TRAJ_00101/master_1/ORCA.molden.input'

Also, I have realized that the path is a bit strange, since TRAJ_00101 does not contain a master_1 folder; a master folder exists in TRAJ_00101/QM/SCRATCH, which I assume is the folder it refers to.

Thank you

Best
Jakob

@maisebastian (Collaborator)

Unfortunately, I also do not know why ORCA crashes there. Is the crash reproducible? If it is, you might want to ask at the ORCA forum. But this does not seem to be a SHARC-related issue.

Best,
Sebastian

@jakobcasa (Author)

Dear Sebastian

Thank you for checking.

I copied the input file from the master_1 folder to a folder outside the SHARC folder structure and am running ORCA.inp there; it has worked fine so far. Assuming the problem is neither ORCA (at least with no loaded modules) nor SHARC, I can only imagine it has to do with some of the loaded modules. Since I load a bundle before submitting SHARC, I will go through each loaded module and check whether it interferes with ORCA. Which modules are required to run SHARC? I assume numpy, but are there any other packages needed to run SHARC?

Thank you

Best
Jakob

@maisebastian (Collaborator)

Hi Jakob,
we always use SHARC with an appropriate Anaconda environment. Please see the SHARC manual for details. Python modules should not interfere with ORCA.

@jakobcasa (Author)

Dear Sebastian

Sorry to disturb you again.

I found out what was wrong: libscalapack.so was not found, so that issue is solved. Unfortunately, I now get:

"STOP 1

#===================================================#
QM call was not successful, aborting the run.
Error code: 6656
#===================================================#"

Is there a list explaining these error codes?

Thank you

Best
Jakob

@maisebastian (Collaborator)

Hi Jakob,
the runQM.sh script returns the exit code of the corresponding interface. If you used SHARC_ORCA.py, you can look up all the exit codes in the code (https://github.com/sharc-md/sharc/blob/main/bin/SHARC_ORCA.py), just search for "sys.exit".

However, there is no exit code 6656 in SHARC_ORCA.py. I found an interesting note at https://stackoverflow.com/questions/59141319/calling-curl-command-from-c-returns-unexpected-error-codes-like-1792-and-6656 saying that exit codes might depend on endianness of your system. In that case, 6656 might actually mean 26, which is the SHARC_ORCA.py exit code for 'Could not find Orca version!'. The ORCA version is read from calling ORCA with a non-existing input file and reading the header from stdout. If the interface could not find the ORCA version, it is likely that ORCA could not be successfully started. Maybe some more libraries are missing?
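As a quick check (a sketch, assuming the 6656 is a raw wait status with the exit code packed into the high byte):

    import os

    status = 6656
    # 26 * 256 = 6656: the exit code sits in the high byte of a raw wait(2) status.
    print(status >> 8)                        # 26
    print(os.waitstatus_to_exitcode(status))  # 26 (POSIX, Python 3.9+)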

Best,
Sebastian

@jakobcasa (Author)

Dear Sebastian

I'm sorry that this is not solved yet.

You are correct; the ORCA path was not given correctly when I changed it (I entered the path only up to the bin folder and not to the orca executable itself). I then tried two things: first, I tried to start the simulation with $SHARC/sharc.x, which resulted in:
"STOP 1
Warning: Decoherence correction turned off!
You have to give an initial state! "

Second, I stayed with $pysharc, which gave me no feedback at all.

In both cases, no master_1 folder was created, and the error message in the QM.err was:

Traceback (most recent call last):
File "/storage/homefs/jc22p286/applications/sharc/bin/SHARC_ORCA.py", line 5278, in
main()
File "/storage/homefs/jc22p286/applications/sharc/bin/SHARC_ORCA.py", line 5242, in main
errorcodes = runjobs(schedule, QMin)
File "/storage/homefs/jc22p286/applications/sharc/bin/SHARC_ORCA.py", line 2888, in runjobs
saveFiles(WORKDIR, jobset[job])
File "/storage/homefs/jc22p286/applications/sharc/bin/SHARC_ORCA.py", line 3398, in saveFiles
saveMolden(WORKDIR, QMin)
File "/storage/homefs/jc22p286/applications/sharc/bin/SHARC_ORCA.py", line 3453, in saveMolden
shutil.copy(fromfile, tofile)
File "/software.el7/software/Python/3.10.4-GCCcore-11.3.0/lib/python3.10/shutil.py", line 417, in copy
copyfile(src, dst, follow_symlinks=follow_symlinks)
File "/software.el7/software/Python/3.10.4-GCCcore-11.3.0/lib/python3.10/shutil.py", line 254, in copyfile
with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/homefs/jc22p286/m2/sharc/MeCN/trj/batch3/SCR/Singlet_1/TRAJ_00071/master_1/ORCA.molden.input'

I also tried multiple library setups, i.e., all combinations of the SciPy-bundle, anaconda3, and the needed shared library. Before the change I only used the SciPy bundle, and it worked.

On a side note, when I forgot the needed SHARC library, the run via the $pysharc path produced an error referring to sharc.x, which also answered my original question.

Thank you for your continuous support

Best
Jakob

@maisebastian (Collaborator)

Well, the message "You have to give an initial state!" means that you are missing the "state" keyword in the SHARC input file. Maybe you did not call SHARC correctly?

Pysharc is intended for fast calculations (not with ab initio interfaces) and has optimized output for this purpose. Since you are using ORCA, you do not need to use pysharc.

I do not know what the error message from the ORCA interface is about. It seems that ORCA did not run successfully, and then the output files were missing.

Generally, I have to say that most of your problem descriptions are very hard to understand, and without the output files it is even harder. For the next messages, please state more clearly what you did and attach the relevant output files. I also recommend that you set up a new SHARC installation from scratch, with the recommended conda environment and without pysharc (if not needed), and that you follow the manual and use the setup scripts as described there.

Best,
Sebastian

@jakobcasa (Author)

Dear Sebastian

Sorry that I was too unspecific; I will try to be more specific in the future. I will also try to install SHARC from scratch.

Best Jakob

@jakobcasa (Author)

Dear Sebastian

SHARC was set up correctly, which I will try to explain in detail in the following:

To check whether SHARC was set up correctly, I set up a new trajectory according to the tutorial. I did everything up to the generation of the ICONDs; at this point, I ran "sh all_run_init.sh", which returned an error:
Traceback (most recent call last):
File "/storage/homefs/jc22p286/applications/sharc/bin/SHARC_ORCA.py", line 5289, in
main()
File "/storage/homefs/jc22p286/applications/sharc/bin/SHARC_ORCA.py", line 5262, in main
QMin, QMout = getQMout(QMin)
File "/storage/homefs/jc22p286/applications/sharc/bin/SHARC_ORCA.py", line 4448, in getQMout
energies = getenergy(logfile, job, QMin)
File "/storage/homefs/jc22p286/applications/sharc/bin/SHARC_ORCA.py", line 4777, in getenergy
energies = {(gsmult, 1): gsenergy}
UnboundLocalError: local variable 'gsenergy' referenced before assignment
I put the ICONDs showing this error in the folder I shared with you last time (SHARC_traj_to_check); they are named "ICOND_00000_org" and "ICOND_00002_org". For the following procedure, I nevertheless copied the corresponding folders from a case where the ICONDs were calculated successfully (on the old cluster system) for the same molecule. This results in unphysical calculations, but since I only wanted to see whether the trajectory starts running, this should be fine. Just in case, I also placed these folders in the shared folder ("ICOND_00000_copied" and "ICOND_00002_copied"). Please note that you will find some files in the "copied" folders that may suggest the calculation did not finish correctly, but it did; those files are from re-runs of the calculations that I then aborted (I hope this is clear). With the copied ICONDs, I set up the trajectory according to the tutorial, and it ran successfully. The respective TRAJ is also in the shared folder (TRAJ_test_run). It had been running since two nights ago and I killed the job manually; 6 steps were calculated, with a restart at step 4 just to check that the restart works (keywords: restart and restart_rerun_last_qm_step).

This is why I suspect that something is wrong and that I may need to restart the calculations (as you already mentioned above). However, since the TRAJs are relatively long (around 50 fs remaining out of 300 fs) and I ran 60 TRAJs, I will discuss with my supervisor what to do.

How did you check the trajectory I sent you, to see whether the change of $SHARC to $pysharc influenced it? I want to check the rest of the 60 trajectories.

I hope my explanation was clearer than the ones before. If not, please let me know and I will try to be more specific.

Best
Jakob

@maisebastian (Collaborator)

Hi Jakob,
thanks for the detailed description. The files in particular were needed to figure out what is going on.

If you check ICOND_00000_org/SCRATCH/master_1/ORCA.log, you can find that the calculation has dramatic convergence problems:

--------------
SCF ITERATIONS
--------------
ITER       Energy         Delta-E        Max-DP      RMS-DP      [F,P]     Damp
               ***  Starting incremental Fock matrix formation  ***
  0 -35615.1060517919   0.000000000000394.36666048  3.01531475 21493.3862799 0.7000
  1 -87071444.5594802648-87035829.453428477049198.83626101  1.36618721 1202300.5480938 0.7000
  2 -301952697.0476108193-214881252.488130569458137.92137345  0.93618718 1773727.7488520 0.7000
  3 -625932960.1453249454-323980263.09771412611084.98853119  0.66044484 3385956.2867737 0.7000
  4 -1438610371.6273629665-812677411.48203802108859.16505030  0.52790050 4078743.3011233 0.7000
  5 -1660359386.9879968166-221749015.360633850098131.66046881  0.72299961 7685916.5921579 0.7000
  6 -2993904942.2015471458-1333545555.21355032920891.54656729  0.57628618 5263199.2708465 0.7000
  7 -3435008399.6011238098-441103457.399576663971221.19547574  1.20198976 3689205.5928032 0.7000
  8 -3777485240.9283695221-342476841.327245712280118.76386001  0.91476773 51064893.6005642 0.7000
  9 -4018004763.1934051514-240519522.265035629272122.65756587  0.99695842 86805212.5293205 0.7000
 10 -4188695640.4967331886-170690877.303328037262212.65792276  1.32735723 111444150.1878744 0.7000
 11 -4311734125.1667156219-123038484.669982433319211.57228861  1.40002752 128768073.1436279 0.7000
                               ***Turning on DIIS***
 12 -4396145603.6661262512-84411478.499410629272210.82712189  1.48359985 141020992.8799981 0.7000
 13 -4459266426.0492534637-63120822.38312721252434.16803958  0.19049923 149602252.1226413 0.7000
 14 -4485392472.5576725006-26126046.508419036865209.88605422  1.18375247 153112975.3676157 0.7000
 15 -5749663572.7526693344-1264271100.194996833801122.64952957  0.84585621 221084788.4627121 0.7000
 16 -6235168905.0237674713-485505332.271098136902215.98786141  1.26029528 417961317.7712792 0.7000
 17 -9037778953.1409435272-2802610048.117176055908168.35874700  1.36691666 1099860879.5595529 0.7000
 18 -13918865217.6605052948-4881086264.519561767578215.52501659  1.73555536 1982298152.8210719 0.7000
 19 -20869399225.3870544434-6950534007.726549148560248.18983563  2.06891636 2999570933.2970886 0.7000
               *** Restarting incremental Fock matrix formation ***
                                   *** Resetting DIIS ***

 WARNING: the maximum gradient error descreased on average only by a factor   0.8
          during the last 20 iterations

At this point, ORCA switches to the TRAH procedure, but that also seems to have a problem that I do not know in more detail:
[file orca_tools/Tool-Numint/qcgrid.cpp, line 4363, Process 2]: Error: the number of points read from the grid does not match the expectation

In ICOND_00002_org, similar things happen, just worse:

--------------
SCF ITERATIONS
--------------
ITER       Energy         Delta-E        Max-DP      RMS-DP      [F,P]     Damp
               ***  Starting incremental Fock matrix formation  ***
  0 -1422140807031984778152322834546269616002963580680506425146689205619677451932440609055939851965878271830046905633926067791206244288515885800354645281804209451427186645985923834774623412666039260844818993243198383556919296.0000000000   0.000000000000234.77408034  7.98879528 44540300894198739969127335682312427252828465895175637790064377491048344774776516495550737884549061051655353723211727852257503299715834643167191074721964943562582436265905246412340198841162135985483821955378742171860752924672.0000000 0.7000
  1 -3299759962662971322538756385781692593306105523747370164672281672479237319452210826153740251705063416721329035027633435466320403167970278970828889270611589580527777453770154194063416782399381545373627426421755299157592569282560.0000000000-3299758540522164597061352172561700313586264849593471866153755667027272973218243231365514429893686293346908133775358296266309202734173737778102967430565378290575905274410002794686384065828640422679796460449240180859067883847680.000000000000160.01419926  2.65541981 21607767691729452500003596011170210730308970823491361383623308680575625864860955593122254147444257931740308546664736310067293678464301944233476456162992519696529012906084027330306395687412771942206033755780348301129567019466752.0000000 0.7000

In ICOND_00002_copied, the ORCA.log shows that the calculation does converge, but the convergence behaviour is really strange:

  0  -6542.7383526079   0.00000000000023.24305084  0.08823965 1286.1273723 0.7000
  1 -11439.8638627264-4897.12551011853016.29858910  0.06188267 898.2928394 0.7000
  2 -14846.2390074315-3406.37514470512411.42665825  0.04342645 627.1697129 0.7000
  3 -17221.2011109369-2374.962103505370 8.00945989  0.03046578 438.1748289 0.7000
  4 -18879.5049811204-1658.303870183532 5.61341154  0.02136165 306.2842718 0.7000
  5 -20038.4778957365-1158.972914616130 3.93374018  0.01497342 214.1626271 0.7000
  6 -20848.9461420030-810.468246266410 2.75643530  0.01049408 149.7797347 0.7000
  7 -21415.9135523896-566.967410386682 1.93133561  0.00735441 104.7655295 0.7000
  8 -21812.6307004549-396.717148065269 1.35312754  0.00515421 73.2853592 0.7000
  9 -22090.2617404548-277.631039999909 0.94796822  0.00361255 51.2666529 0.7000
 10 -22284.5721561709-194.310415716096 0.66408828  0.00253239 35.8643044 0.7000
 11 -22420.5757600542-136.003603883262 0.46519763  0.00177558 25.0896016 0.7000
                               ***Turning on DIIS***
 12 -22515.7723961791 -95.196636124936 0.32586002  0.00124532 17.6019150 0.7000
 13 -22582.4075787180 -66.635182538917 0.22846745  0.00087626 12.3538164 0.7000
 14 -22629.0512624253 -46.643683707280 0.15999134  0.00061620  8.6543805 0.7000
 15 -22661.7015203092 -32.650257883910 0.11200213  0.00043298  6.0592679 0.7000
 16 -22684.5566222882 -22.855101979007 0.07840294  0.00030415  4.2417760 0.7000
 17 -22700.5551881576 -15.998565869424 0.05488011  0.00021376  2.9693296 0.7000
 18 -22711.7541974964 -11.199009338787 0.03841685  0.00015030  2.0786765 0.7000
 19 -22719.5935200503  -7.839322553824 0.02689246  0.00010574  1.4551893 0.7000
               *** Restarting incremental Fock matrix formation ***
                                   *** Resetting DIIS ***
 20 -22725.0799755444  -5.486455494112 0.01880224  0.00007436  1.0186978 0.7000
 21 -22728.9212658547  -3.841290310331 0.01318123  0.00005308  0.7149480 0.7000
 22 -22731.6101735035  -2.688907648793 0.00923042  0.00003790  0.5008615 0.7000
 23 -22733.4924132555  -1.882239752013 0.00646175  0.00002704  0.3506809 0.7000
 24 -22734.8099847832  -1.317571527710 0.00452330  0.00001931  0.2454980 0.7000
 25 -22735.7322875224  -0.922302739222 0.00316621  0.00001387  0.1718560 0.7000
 26 -22736.3779013504  -0.645613827939 0.00221639  0.00001002  0.1203086 0.7000
 27 -22736.8298323325  -0.451930982104 0.00517168  0.00002433  0.0842231 0.0000
 28 -22737.8843455812  -1.054513248771 0.00025397  0.00000424  0.0002102 0.0000
               *** Restarting incremental Fock matrix formation ***
                                   *** Resetting DIIS ***
 29 -22737.8848449166  -0.000499335307 0.00369693  0.00004109  0.0006652 0.0000
 30 -22737.8848583780  -0.000013461471 0.00030132  0.00000476  0.0000717 0.0000
 31 -22737.8848588294  -0.000000451419 0.00009754  0.00000117  0.0000966 0.0000
 32 -22737.8848589017  -0.000000072247 0.00006480  0.00000098  0.0000556 0.0000
 33 -22737.8848588130   0.000000088734 0.00004242  0.00000060  0.0000415 0.0000
 34 -22737.8848588167  -0.000000003787 0.00003209  0.00000045  0.0000149 0.0000

The bad convergence behaviour could be due to the bad initial orbitals, which come from the corresponding ICOND_00000 calculation. Unfortunately, in ICOND_00000_copied, the ORCA.log does not show details (as you said, it is a later attempt that failed).

Overall, I cannot really figure out what is the problem. My suspicion is that on your new cluster, there is some problem with the ORCA installation (maybe some library it depends on), but I cannot be sure. Do similar (SHARC-free) ORCA calculations show such problems on the new cluster? You might also need to be careful when using gbw files from one cluster as input for the other cluster.

About your new trajectory: I cannot find it in the cloud folder. What I mostly did when checking was the following:

  1. Look at the plots from make_gnuscript.py (https://sharc-md.org/?page_id=50#sec:make_gnuscript.py):
  • is Etot nearly constant
  • are all Epots smooth/continuous
  • are the colors for the oscillator strengths smooth
  • is there unexpected/excessive hopping
  • are the electronic populations smooth and continuous
  2. Run the following command (N >= 2):
    grep -A N "Overlap" output.dat | grep -B 2 Overlap | grep E > test
    and then plot the result with gnuplot ("test" using 0:M for each column M) to observe whether the wave function overlaps evolve smoothly. If something goes wrong, they are very sensitive and might exhibit random sign changes (especially off-diagonal elements that evolve around zero).
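
If gnuplot is not at hand, a rough Python alternative would be (a sketch, assuming the grep output in "test" parses as whitespace-separated numeric columns):

    import numpy as np
    import matplotlib.pyplot as plt

    # Plot every column of the extracted overlap lines against the record index
    # and look for sudden spikes or sign flips, especially in off-diagonal elements.
    data = np.loadtxt("test")
    for col in range(data.shape[1]):
        plt.plot(data[:, col], lw=0.8, label=f"col {col}")
    plt.xlabel("record index")
    plt.ylabel("overlap matrix element")
    plt.legend(fontsize="small", ncol=2)
    plt.show()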

After looking again at your trajectory (00109), I would say that the hop at 23.00 fs and the corresponding spike in the overlap matrix elements does look problematic. I did not check this last time, but this time I did, and this hop/spike coincides with one of the restarts, in particular a double restart according to the log file. However, I do not think that this is related to changing the $SHARC variable, which was the original question. Do you know at which time step the change of the environment variable was done?

Best,
Sebastian

@jakobcasa (Author)

Dear Sebastian

Thank you for the explanation and for looking at the files. Until now, I have not observed any convergence issues when running ORCA (SHARC-free). The convergence issue in the ICOND_org folders is because I did not provide any .gbw file (I only wanted to check whether the TRAJ starts correctly). Is the error (UnboundLocalError: local variable 'gsenergy' referenced before assignment) due to the convergence issue?

Thank you as well for listing the points I need to go through for the rest of the 60 trajectories to check whether they have any problems. That said, I believe I know why there is weird behavior at 23 fs: I included the non-equilibrated implicit solvent model up to 22.5 fs, and starting from 23 fs (until the end) I included the equilibrated one. This could explain the described behavior.

I don't precisely recall when the change of the variable happened, but I know it was before 28 Feb (with a 14-day period where the job did not run), and each step takes approximately 10,000 s. With this very rough approximation, the change would fall around the 50th step. Beyond that, I can't give you a more specific time step at which I changed the path.

I will redo the ICOND_org calculation with a .gbw file and let you know; I will upload the respective files as well. Maybe the "UnboundLocalError: local variable 'gsenergy' referenced before assignment" error will then disappear.

Best Jakob

@maisebastian (Collaborator)

Hi,
the error message basically says that the interface could not find the ground-state energy in the log file, which is due to the convergence problems. The failure is not caught earlier because ORCA does not return non-zero exit codes (I guess we could implement a check for a successful termination message at the end of the log file).
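
Such a check could look roughly like this (a sketch, not part of SHARC_ORCA.py, assuming ORCA's usual "ORCA TERMINATED NORMALLY" banner near the end of a successful log):

    def orca_terminated_normally(logfile, tail_bytes=8192):
        # Read only the tail of the (possibly large) log and look for the banner.
        with open(logfile, "rb") as f:
            f.seek(0, 2)
            f.seek(max(0, f.tell() - tail_bytes))
            tail = f.read().decode(errors="replace")
        return "ORCA TERMINATED NORMALLY" in tail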

What you say about the change at 23 fs makes sense. I would expect significant differences between non-equilibrium and equilibrium solvation treatments in PCM, in particular for bright states. I think it is not a good idea to make changes to the electronic structure settings during a trajectory.
That being said, in my personal opinion it is almost never a good idea to run SHARC trajectories with implicit solvation. These methods intrinsically assume equilibration between the solvent and the electron density (of some state), which is just not the case in non-equilibrium nonadiabatic trajectories. We generally prefer QM/MM simulations in cases where solvent matters. If you nonetheless plan to use implicit solvation, you might want to consider setting the epsilon equal to epsilon infinity and then use equilibrium solvation. In this way, you include only the electronic response of the solvent (but not orientational and librational response), which can be assumed to be so fast that it is always in equilibrium with the molecule.

Regarding the time when you switched: as I said, I do not have the impression that the change of the environment variable had any effect.

Best,
Sebastian
