My network looks like this:
Win 10 PCs <-> | Samba, Ubuntu Server 20.04 (initiator) | <---- 50 Mbit DSL ----> | Ubuntu Server 20.04 (replica), Samba | <-> Win 10 PCs |
The two Ubuntu servers run only osync and Samba. osync syncs two folders between the two Ubuntu servers.
Initiator: every night a cron job also runs fsync (not osync) to back up the initiator folder to another local disk.
Replica: every Sunday night a cron job also runs fsync (not osync) to back up the replica folder to another local disk.
These are the only tasks the two servers run.
osync hangs and I have to restart the service manually, so I cannot leave it unattended.
To Reproduce
Unfortunately this happens randomly, anywhere from once to ten times per day, so I don't know how to help you reproduce it.
Expected behavior
Kill the processes that are still running, and then continue monitoring the folder for changes.
Deviated behavior
It kills the processes that are still running, but then hangs.
Logs
I run osync as a service. It works fine, but at random times it becomes unresponsive. This is what the log says when that happens:
.
.
.
TIME: 2999 - Current tasks still running with pids [3402351].
TIME: 3001 - (WARN):Max soft execution time exceeded for task [Sync] with pids [3402351].
TIME: 3004 - Sent mail using sendmail command without attachment.
TIME: 3059 - Current tasks still running with pids [3402351].
TIME: 3119 - Current tasks still running with pids [3402351].
TIME: 3179 - Current tasks still running with pids [3402351].
TIME: 3239 - Current tasks still running with pids [3402351].
TIME: 3299 - Current tasks still running with pids [3402351].
TIME: 3359 - Current tasks still running with pids [3402351].
TIME: 3419 - Current tasks still running with pids [3402351].
TIME: 3479 - Current tasks still running with pids [3402351].
TIME: 3539 - Current tasks still running with pids [3402351].
TIME: 3599 - Current tasks still running with pids [3402351].
TIME: 3601 - (ERROR):Max hard execution time exceeded for task [Sync] with pids [3402351]. Stopping task execution.
TIME: 3601 - (CRITICAL):Cannot create replica file list in [/var/fs/].
TIME: 3601 - (WARN):Command was [/usr/bin/rsync --rsync-path="(o_O) rsync" -rltD -8 --modify-window=2 --omit-dir-times --no-whole-file -p -o -g --executability --exclude ".osync_workdir" -e "/usr/bin/ssh -i /home/gaionaus/.ssh/id_rsa -p 22" --list-only [email protected]:"/var/fs/" 2> "/tmp/osync.treeList.target.error.3400238.20220118T052539.886827061" | (grep -E "^-|^d|^l" || :) | (awk '{$1=$2=$3=$4="" ;print substr($0,5)}' || :) | (awk 'BEGIN { FS=" -> " } ; { print $1 }' || :) | (grep -v "^.$" || :) | sort > "/tmp/osync.treeList.target.3400238.20220118T052539.886827061"].
TIME: 3601 - (WARN):Command output
rsync error: received SIGINT, SIGTERM, or SIGHUP (code 20) at rsync.c(644) [Receiver=3.1.3]
TIME: 3601 - Task with pid [3402351] stopped successfully.
TIME: 3604 - Sent mail using sendmail command without attachment.
TIME: 3605 - (ERROR):osync finished with errors.
TIME: 3608 - Sent mail using sendmail command without attachment.
Tue Jan 18 06:25:47 UTC 2022 - (ERROR):osync child exited with error.
Tue Jan 18 06:25:47 UTC 2022 - #### Monitoring now.
Tue Jan 18 06:35:47 UTC 2022 - #### 600 timeout reached, running sync.
TIME: 0 - -------------------------------------------------------------
TIME: 0 - Tue Jan 18 06:35:47 UTC 2022 - osync 1.2 script begin.
TIME: 0 - -------------------------------------------------------------
TIME: 0 - Sync task [sync_link] launched as gaionaus@fs1 (PID 3613865)
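To make the timing in the log above easier to read, here is a minimal Python sketch that pulls the elapsed seconds and severity out of lines in this `TIME: N - (LEVEL):message` shape. It is only an illustration based on the lines quoted in this report, not part of osync; the function names and the `INFO` label for lines without a severity tag are my own assumptions.

```python
import re

# Matches log lines shaped like the ones quoted in this report, e.g.
# "TIME: 3001 - (WARN):Max soft execution time exceeded ..."
LINE_RE = re.compile(r"^TIME: (\d+) - (?:\((\w+)\):)?(.*)$")


def parse_line(line):
    """Return (elapsed_seconds, level, message), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    seconds, level, message = match.groups()
    # Lines without a "(LEVEL):" tag are labeled INFO here (an assumption).
    return int(seconds), level or "INFO", message


def first_event(lines, level):
    """Return the elapsed time of the first entry logged at the given level."""
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None and parsed[1] == level:
            return parsed[0]
    return None


sample = [
    "TIME: 2999 - Current tasks still running with pids [3402351].",
    "TIME: 3001 - (WARN):Max soft execution time exceeded for task [Sync] with pids [3402351].",
    "TIME: 3601 - (ERROR):Max hard execution time exceeded for task [Sync] with pids [3402351]. Stopping task execution.",
]
```

On the sample lines, the first WARN appears at 3001 seconds and the first ERROR at 3601 seconds, which sit just past the `SOFT_MAX_EXEC_TIME=3000` and `HARD_MAX_EXEC_TIME=3600` settings listed further down in this report.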
Environment (please complete the following information):
Osync Version:
PROGRAM_VERSION=1.2
PROGRAM_BUILD=2017032101
IS_STABLE=yes
OS: ubuntu 20.04
Bitness: x64
Shell : bash
Additional context
It stays on the last line forever: "TIME: 0 - Sync task [sync_link] launched as gaionaus@fs1 (PID 3613865)"
and I have to restart the service manually.
What seems strange in the log above is this part: "/usr/bin/rsync --rsync-path="(o_O) rsync" "
It looks like a placeholder that was never replaced?
Some settings from the osync conf file:
RSYNC_OPTIONAL_ARGS="--modify-window=2 --omit-dir-times"
SOFT_MAX_EXEC_TIME=3000
HARD_MAX_EXEC_TIME=3600
KEEP_LOGGING=60
MIN_WAIT=120
MAX_WAIT=600
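Until the root cause is found, one possible stopgap for the manual restarts is an external watchdog that restarts the service when the log stops growing. The sketch below is only an assumption-laden illustration, not an osync feature: the log path and the systemd unit name are hypothetical and must be supplied by the caller, and the idle threshold of twice `MAX_WAIT` is a guess based on the settings above (the log is silent for up to `MAX_WAIT` seconds while monitoring, and is written every `KEEP_LOGGING` seconds during a sync).

```python
import os
import subprocess
import time

MAX_WAIT = 600  # seconds, taken from the osync settings quoted above


def is_stalled(log_path, max_idle_seconds, now=None):
    """Return True if the log file has not been modified for max_idle_seconds."""
    if now is None:
        now = time.time()
    return (now - os.path.getmtime(log_path)) > max_idle_seconds


def restart_service(service_name):
    """Restart a systemd unit and return the exit code of systemctl."""
    result = subprocess.run(["systemctl", "restart", service_name], check=False)
    return result.returncode


def check_once(log_path, service_name, max_idle_seconds=2 * MAX_WAIT):
    """Restart the service only when the log looks stale. Returns True if restarted."""
    if is_stalled(log_path, max_idle_seconds):
        restart_service(service_name)
        return True
    return False
```

Something like `check_once("/path/to/osync.log", "your-osync-unit")` could then be run from cron every few minutes; both arguments here are placeholders. This only hides the hang, so it would not replace fixing the underlying network issue.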
That really sounds like a network problem.
Usually what happens is that a drive mounted over a network connection that gets lost will leave an rsync zombie process behind.
The (o_O) part of the log is just a placeholder that replaces a security-sensitive variable, which should never be logged.
Do you have any supervision software on your systems?
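If no supervision software is in place, a very small reachability probe can at least help tell a network outage apart from an osync problem. The Python sketch below simply tries to open a TCP connection to the replica's SSH port (port 22, as in the logged command) within a timeout. The function name and the default timeout are assumptions for illustration, and the host has to be filled in, since it is redacted in the log.

```python
import socket


def ssh_port_reachable(host, port=22, timeout_seconds=10.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Logging the result of this check next to the osync log (for example once a minute) would show whether the hangs coincide with periods where the replica was unreachable. A successful TCP connect does not prove the link is healthy enough for a long rsync run, so this is a coarse signal only.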
Hi and thanks for the reply.
Yes, it is a network problem.
osync itself runs just fine.
I have only observed some strange behavior with file deletion: osync deleted files that it should not have deleted. The clocks of both servers are synced. It is probably caused by the bad network connection on the target server's side. Could an incomplete file list, produced over the bad connection, be the reason?
Anyway, I disabled file deletion and it has been syncing fine for over 3 months now.
A bad network connection shouldn't be an issue for tasks like deletion, since osync would simply not allow the run to continue, just as it did in your logs.
Anyway, I've never lost a single file with osync over the years, so I have no idea what your culprit could be.