write lock stuck -- tried fsync() #25
Comments
I'm willing to kick the tires on your proposed changes on x86-64 hardware. Can you post DIFFs? |
The diffs I posted have a lot of debug code and also some forced overrides on some parameters. Mostly just try the lines with the fsync(). Thanks! |
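Please provide more information:
1) steps to reproduce,
2) error messages displayed when further access is blocked (please run Heyu in verbose mode -- command line option -v),
3) whether further access is blocked temporarily or permanently (requires manual recovery).
Thanks,
Janusz |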
Hi Janusz,
1) steps to reproduce:
a) Download the repository.
b) Compile on a Raspberry Pi 4, 4 GB.
c) Connect to the CM11A through a USB-to-serial adapter.
d) Start the heyu relay.
e) Try various commands.
2) I don't recall the exact error messages; basically, ttyUSB1 was locked.
3) Permanent failure; requires manual deletion of the lock file.
The Raspberry Pi 4 is fast and multi-processor/multi-threaded, and the USB-to-RS232 adapter is somewhat asynchronous. As a result, some of this old code, which expects to set a lock and then clear it, often ends up with the set and clear out of order. That is why adding an fsync() call after each lock helped to resolve the issue.
I forgot to mention that instead of running from the Raspberry Pi's SD card, I am running from a high-speed SATA SSD attached through a USB3-to-SATA adapter. This also adds speed and asynchrony to the processing and I/O.
As previously suggested, I also sent diffs of all the places where I added fsync().
I think the problem is just out-of-order I/O, and I think the fsync() calls around the lock are the solution. I cannot compile and test on other platforms, and I had no problems before replacing the Pentium computer with a Raspberry Pi, so I think the code simply had this I/O ordering weakness. If the fsync() calls can be added in a platform-independent way, I think it will be a robust fix.
Thanks,
James
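For anyone who wants to experiment with the "set a lock, flush it, clear it" idea described above, here is a minimal sketch in C. It is not taken from the Heyu sources: the names acquire_lockfile() and release_lockfile() are hypothetical, and the PID format is only an assumption. It creates the lock file atomically with O_CREAT | O_EXCL and calls fsync() on the descriptor before closing it. Note that fsync() only pushes the file's data to storage; whether that addresses the ordering problem reported here is untested.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of Heyu.
 * Creates the lock file and writes our PID into it.
 * Returns 0 on success, -1 if the lock already exists or on any error. */
int acquire_lockfile(const char *path)
{
    char buf[32];
    int fd, len;

    assert(path != NULL);

    /* O_EXCL makes creation atomic: open() fails if the file exists. */
    fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0)
        return -1;

    len = snprintf(buf, sizeof(buf), "%10d\n", (int)getpid());

    /* Write the PID, then ask the kernel to flush it to storage. */
    if (write(fd, buf, (size_t)len) != (ssize_t)len || fsync(fd) != 0) {
        close(fd);
        unlink(path);   /* do not leave a half-written lock behind */
        return -1;
    }
    return close(fd);
}

/* Hypothetical helper: removes the lock file. Returns 0 on success. */
int release_lockfile(const char *path)
{
    assert(path != NULL);
    return unlink(path);
}
```

A second call to acquire_lockfile() on the same path fails until release_lockfile() has removed the file, which is the behavior the lock is meant to provide.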
… On Mar 26, 2021, at 8:17 PM, Janusz Krzysztofik ***@***.***> wrote:
Please provide more information:
steps to reproduce,
error messages displayed when further access is blocked (please run Heyu in verbose mode -- command line option -v),
whether further access is blocked temporarily or permanently (requires manual recovery).
Thanks,
Janusz
|
Hi James and all, I've just pushed a new branch with locking fixes (c67966a) for you to try. Since I have no access to a test environment at the moment, the patch has only been compile-tested on Linux, but I hope it can resolve the long-standing issues with stale lock files. If you'd like to comment on the patch itself, not only on the results of your testing, you can add your comments to pull request #33, which I also created. Thanks, |
Thanks, I will try to get to it in the next few days.
… On Apr 3, 2021, at 6:26 AM, Janusz Krzysztofik ***@***.***> wrote:
Hi James and all,
I've just pushed a new branch with locking fixes (c67966a) for you to try. Since I have no access to a test environment at the moment, the patch has only been compile-tested on Linux, but I hope it can resolve the long-standing issues with stale lock files.
If you'd like to comment on the patch itself, not only on the results of your testing, you can add your comments to pull request #33, which I also created.
Thanks,
Janusz
|
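While trying to solve #46, I tried this new branch, but it wouldn't start on my Raspberry Pi 4 running Raspbian. After "Trying to lock (/usr/local/var/lock/LCK..heyu.write.ttyUSB0)", the output repeats "lockpid: Checking for file '/usr/local/var/lock/LCK..LCK..'" ad infinitum. Any thoughts on how to move past this? |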
Sorry, I have not upgraded since solving my issue. I do still occasionally get I/O locks and need to delete the lock file.
Maybe it's time to look at the underpinnings of the file locking, since my patch was just a hack.
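One direction such a rework could take, offered only as a sketch and not as what Heyu does or what the topic/locking branch implements: hold an advisory lock with flock() on an open descriptor. The kernel drops the lock when the process exits or crashes, so nothing stale is left to delete by hand. The helper names below are hypothetical. Two caveats: flock() is a BSD/Linux call rather than POSIX, and other serial programs that honor the traditional LCK.. file convention would not see this kind of lock.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper, not part of Heyu.
 * Returns an open descriptor that holds the lock, or -1 if another
 * holder already has it (or on any other error). */
int lock_exclusive(const char *path)
{
    int fd;

    assert(path != NULL);

    fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    /* LOCK_NB: fail immediately instead of blocking if already locked. */
    if (flock(fd, LOCK_EX | LOCK_NB) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: releases the lock and closes the descriptor. */
void unlock_exclusive(int fd)
{
    if (fd >= 0) {
        flock(fd, LOCK_UN);
        close(fd);
    }
}
```

The file itself can stay on disk between runs; only the kernel-held lock matters, so there is no "file exists but owner is gone" state to recover from.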
… On Nov 4, 2022, at 2:26 PM, Dean Blackketter ***@***.***> wrote:
While trying to solve #46, I tried this new branch, but it wouldn't start on my Raspberry Pi 4 running Raspbian. The output looks like this:
Version:2.11-rc3
Searching for '/home/dean/.heyu/x10config'
Searching for '/usr/local/etc/heyu/x10.conf'
Found configuration file '/usr/local/etc/heyu/x10.conf'
Heyu directory /usr/local/etc/heyu/ is writable.
Reading Heyu configuration file '/usr/local/etc/heyu/x10.conf'
Trying to lock (/usr/local/var/lock/LCK..heyu.write.ttyUSB0)
lockpid: Checking for file '/usr/local/var/lock/LCK..LCK..'
lockpid: Checking for file '/usr/local/var/lock/LCK..LCK..'
lockpid: Checking for file '/usr/local/var/lock/LCK..LCK..'
lockpid: Checking for file '/usr/local/var/lock/LCK..LCK..'
lockpid: Checking for file '/usr/local/var/lock/LCK..LCK..'
lockpid: Checking for file '/usr/local/var/lock/LCK..LCK..'
ad infinitum
Any thoughts on how to move past this?
|
I've just updated my pull request #33 with a fix pushed to the topic/locking branch. Since I have no access to hardware, it is only compile tested. Please download it from https://github.com/HeyuX10Automation/heyu/archive/refs/heads/topic/locking.zip, build, test and report. |
Hi,
I know this is old code without much maintenance, but I thought I would share my issue and a potential solution.
I am using a Raspberry Pi 4, 4 GB, running from a fast USB SSD (not the SD card), with a PL2303 USB-to-serial adapter connected through a USB3 hub to a USB3 port on the Pi. This replaces a long line of x86 Linux machines on which I have used heyu in the past, so I am an experienced long-term user, and this issue appeared with the new server hardware.
My issue is that occasionally (in fact frequently) heyu gets stuck with LCK..heyu.write.ttyUSB1 left in place, blocking further access.
I tried to debug and look through the locking and unlocking code, and even added a few more debug statements. Of course, with the debug statements in place I did not see a failure again, which made me think it could be a timing issue: since this is a fast machine with multiple cores, the I/O and the threads could be getting out of sync. I have added fsync() calls after the writes to the lock file and the writes to the serial port:
i.e., in tty.c
tty.c- if ( (f = fopen(ttydev, "w")) != NULL ) {
tty.c- fprintf(f, " %d\n", (int)getpid() );
tty.c: fsync(f);
and in many places in x10aux.c, xread.c, and xwrite.c, with code like that below. So far, no hangs.
xread.c- if ( (!i_am_relay || i_am_aux) && write(sptty, (char *)lbuf , 4) != 4 )
xread.c- {
xread.c: fsync(sptty);
xwrite.c- ignoret = write(sptty, (char *)outbuf, size + 4);
xwrite.c: fsync(sptty);
Since I don't know about the portability and robustness of this fix, I am not proposing to pull my changes at this time, but I wanted to share my findings.
Thanks!
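A note for anyone trying the tty.c change above: fsync() takes an integer file descriptor, while f in that excerpt is a FILE * returned by fopen(), so fsync(f) will not compile cleanly. With stdio, the buffered data also has to be pushed out with fflush() before fsync() can see it. The following is a minimal sketch of that pattern; the helper name write_pid_durably() is hypothetical and is not in the Heyu sources. For the serial port descriptor (sptty), tcdrain() from <termios.h> is the call meant to wait until queued output has been transmitted, and fsync() may not be supported on a tty at all, so treat the serial port fsync() calls as an experiment rather than a verified fix.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of Heyu.
 * Writes our PID to an already opened stream and pushes it to storage.
 * Returns 0 on success, -1 on failure. */
int write_pid_durably(FILE *f)
{
    assert(f != NULL);

    if (fprintf(f, "%10d\n", (int)getpid()) < 0)
        return -1;

    /* Step 1: move data from the stdio buffer into the kernel. */
    if (fflush(f) != 0)
        return -1;

    /* Step 2: ask the kernel to flush the file to storage.
     * fileno() converts the FILE * into the descriptor fsync() expects. */
    return fsync(fileno(f)) == 0 ? 0 : -1;
}
```

In the tty.c excerpt, the call would replace the fprintf()/fsync() pair, for example write_pid_durably(f) right after the fopen() succeeds.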