Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Patch issues with pvmonitors in EPICS base #23

Merged
merged 2 commits into from
May 31, 2024

Conversation

simon-ess
Copy link
Contributor

If you have a waveform-type array (aai, waveform, etc.) and have caputlog loaded and initialised, then the IOC will no longer send out pvmonitors on changes (though it still sends on CA monitors). This seems to be due to a bug in PVAccessCPP about caching of PV channels: when we hit the caPutLog callback code on a put, we call get_array_info from the record rset code, which overwrites the requested field with the internal array buffer. This means that we modify the original pchan in memory which later causes a field comparison in the db event code (db_post_events) to fail.

This patch is a minimal fix to caPutLog to get rid of the problem. While the actual issue seems to be a PVAccessCPP one (it seems that the latest version of PVXS does not have this problem), it seems simpler to fix it here instead of digging deeply into code that ultimately will be no longer in use in a few years.

@anjohnson
Copy link
Member

Seems reasonable, the only slight concern I have is that I don't remember whether using memcpy() to duplicate a dbAddr is kosher or not; it should be easy to confirm that though.

Did you find something similar in autosave/saverestore?

@mdavidsaver
Copy link
Contributor

mdavidsaver commented Mar 30, 2024

Is this PR meant to fix #22 and/or epics-base/pva2pva#60 ?

@mdavidsaver
Copy link
Contributor

... I don't remember whether using memcpy() to duplicate a dbAddr is kosher or not; ...

I think this is fine.

It should be sufficient to save/restore only paddr->pfield as is done by dbGet(). https://github.com/epics-base/epics-base/blob/5dfc6caf3c898b213c8458f4766152a1b5a3e477/modules/database/src/ioc/db/dbAccess.c#L909

@simon-ess
Copy link
Contributor Author

simon-ess commented Apr 2, 2024

Is this PR meant to fix #23 and/or epics-base/pva2pva#60 ?

Yes. Even though I believe that the issues is fundamentally a PVAccessCPP (or one of the related projects) issue, this seemed to be the simplest fix that was relatively clean and would not necessitate a new (patched) release of EPICS base on our end.

@mdavidsaver
Copy link
Contributor

For better or worst, the semantics of dbAddr::pfield allow record support get_array_info() to overwrite since epics-base/epics-base@4f6040d (my doing). This was part of the change to allow waveformRecord device support to swap array buffers in without copying (epics-base/epics-base@fb21c7c). The need for callers to save/restore pfield was not properly documented.

Of course, the asLib API is also not well documented. With QSRV1 and 2, I don't think I considered all of the implications of passing dbChannel* into asTrapWriteWithData(). Especially since this call to get_array_info() happens on a different thread, with no synchronization (aka. there is a race condition).

... I believe that the issues is fundamentally a PVAccessCPP ...

So where to pin the "blame"?

Probably with Base. asTrapWriteMessage::serverSpecific is a void*, although practically a dbChannel* (previously a dbAddr*?). DBADDR* was originally immutable, but then started being mutable.

So I think asLib.h needs to change so that asTrapWriteBeforeWithData() makes a "defensive" copy of the dbChannel* prior to queuing. This may also mean facing some tough choices about type safety vs. API backwards compatibility...

@ericonr
Copy link

ericonr commented Apr 5, 2024

Per https://epics.anl.gov/tech-talk/2024/msg00481.php, we noticed issues with caPutLog for Channel Access as well; it manifested quite differently from #22, though.

This PR worked for us, so I'd like to know if I can do anything to help merge it (and ideally tag a release) :)

@simon-ess
Copy link
Contributor Author

@anjohnson I have updated release notes; if you have no objections I will merge this in.

@ericonr
Copy link

ericonr commented May 22, 2024

@simon-ess do you think you could update the commit message and release notes to reflect that this issue is not specific to PVAccess?

If you have a waveform-type array (aai, waveform, etc.) and have caputlog
loaded and initialised, then the IOC will no longer send out pvmonitors
on changes (though it still sends on CA monitors). This seems to be due
to a bug in PVAccessCPP about caching of PV channels: when we hit the
caPutLog callback code on a put, we call get_array_info from the record
rset code, which overwrites the requested field with the internal array
buffer. This means that we modify the original pchan in memory which
later causes a field comparison in the db event code (db_post_events) to
fail.

This patch is a minimal fix to caPutLog to get rid of the problem. While
the actual issue seems to be a PVAccessCPP one (it seems that the latest
version of PVXS does not have this problem), it seems simpler to fix it
here instead of digging deeply into code that ultimately will be no
longer in use in a few years.

Note that while the issue was first observed in the context of PVMonitors,
it does seem to be more generic than that and is thus not likely only an
issue with PVAccessCPP; see
https://epics.anl.gov/tech-talk/2024/msg00481.php for another example of
a related issue that this change fixes.

Co-authored-by: Lucas A. M. Magalhães <[email protected]>
Co-authored-by: Gabriel Fedel <[email protected]>
Co-authored-by: Mateusz Nabywaniec <[email protected]>
@simon-ess
Copy link
Contributor Author

simon-ess commented May 30, 2024

@simon-ess do you think you could update the commit message and release notes to reflect that this issue is not specific to PVAccess?

Hi @ericonr - I have updated the commit message and release notes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants