Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC]: add a logging handler callback #2087

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

williamcroberts
Copy link
Member

@williamcroberts williamcroberts commented Jun 7, 2021

[RFC]: add a logging handler callback

Just some code to guide a discussion on how to add a logging callback
routine to make applications and bindings uses the various apis easier
to integrate into different logging mechanisms. This would be nice for
the Python bindings.

Changed:
- Updated the API to call with various params over just a single string.
- Added SYS and ESYS routines.

TODO:
1. Add FAPI support
2. Test
3. Fiddle with names, style and locations in the code base

@williamcroberts williamcroberts marked this pull request as draft June 7, 2021 17:45
@williamcroberts
Copy link
Member Author

@joholl @AndreasFuchsSIT something to guide discussion with

@joholl
Copy link
Collaborator

joholl commented Jun 8, 2021

This callback handler is too low-level, imo. It handles logging on string level, i.e. you cannot change the format. Also, on python level you would need to parse the logs again to extract file name, line number etc.

For reference, a python log record needs to know: module name, log level, file name, line number, function name and message (see class LogRecord).

Therefore, I'd suggest that the callback handler should look something like this. I guess all of those are null-terminated strings, including the message, so no size needed.

typedef void (*TSS2_LOG_HANDLER)(
            char      const *log_module,
            log_level        loglevel,
            char      const *file_name,
            uint32_t         line_no,
            char      const *function_name
            char      const *message);

In this case, we'd need to move the formatting from snprintf(NULL, 0, "%s:%s:%s:%d:%s() %s \n", log_strings[loglevel], module, file, line, func, msg); into the default log handler.

@AndreasFuchsTPM
Copy link
Member

Agreed with @joholl

The big problem I see here however is when you bind against two of the tss2 libraries at the same time.
I.e. we cannot have a global function call, since linking will break this.

Options I see:

  1. Have a libtss2-log.so where the global loghandler logic is kept
  2. Have a Esys_SetLogHandler, Fapi_SetLogHandler, TctiLdr_SetLoghandler, etc call for global state without a context parameter
  3. Have a Esys_SetLogHandler, Fapi_SetLogHandler, TctiLdr_SetLoghandler, etc call for per context setting
  4. Allow a file-descriptor for log output via an environment variable
  5. other ideas ?

@joholl
Copy link
Collaborator

joholl commented Jun 8, 2021

@AndreasFuchsSIT Good point! Tbh, those are all somewhat ugly...

Btw, I already added support for logging to an fd (#1976). Additionally, I implemented a PoC on python side (tpm2-software/tpm2-pytss#164) but parsing logs on python level in a separate thread is really not the most elegant solution.

I have another suggestion: let's not add a callback mechanism but just a logging function as a weak symbol. I think we should be able to overwrite it, then.

@AndreasFuchsTPM
Copy link
Member

@joholl I was referring to a custom fd > stderr, e.g. fd = createPipe(); setenv("TSS2_LOGFILE", "fd:%i" % fd);

Could you explain the weak symbol thing and how to use it here to me ? Is it something like a dlsym(dlopen(NULL), "mysymbol"); ? And is it C99 compliant ?

@williamcroberts
Copy link
Member Author

williamcroberts commented Jun 8, 2021

Agreed with @joholl

The big problem I see here however is when you bind against two of the tss2 libraries at the same time.
I.e. we cannot have a global function call, since linking will break this.

Sorry I actually forgot to add some of the files (added now), but that I don't think is correct. Each library is statically linked to the logging code, which means each instance of the library has it's own global state. Ie, a call to changing to Sys will not modify Esys.

I did a test program just to confirm my understanding and we never see duplicate functions in the output, which I think confirms my understanding:

Code

#include <stdio.h>
#include <tss2/tss2_sys.h>
#include <tss2/tss2_esys.h>

void sys_handler(const char *msg, size_t size) {
	printf("SYS: %s\n", msg);
}

void esys_handler(const char *msg, size_t size) {
	printf("ESYS: %s\n", msg);
}

int main ()
{

    printf("Setting NEW Sys Handler: %p\n", sys_handler);
    TSS2_LOG_HANDLER old = Tss2_Sys_SetLogHandler(sys_handler);
    printf("SYS Old Handler: %p\n", old);
    
    printf("NEW ESys Handler: %p\n", esys_handler);
    old = Esys_SetLogHandler(esys_handler);
    printf("ESYS Old Handler: %p\n", old);

    return 0;
}

Building:

gcc -o go a.c `pkg-config --libs tss2-esys tss2-sys`

Output

./go
Setting NEW Sys Handler: 0x55f12dc3f189
SYS Old Handler: 0x60ef5c93
NEW ESys Handler: 0x55f12dc3f1b8
ESYS Old Handler: 0x7f9760f9a9e3

Options I see:

  1. Have a libtss2-log.so where the global loghandler logic is kept

That would then break it and the logic as we need a non-shared lib here IIUC.

  1. Have a Esys_SetLogHandler, Fapi_SetLogHandler, TctiLdr_SetLoghandler, etc call for global state without a context parameter

Yep, I added in this update, at least for Esys and Sys so folks can play with it more.

  1. Have a Esys_SetLogHandler, Fapi_SetLogHandler, TctiLdr_SetLoghandler, etc call for per context setting

I would say no to this one. The global library logger would then get used for context creation logs, and then another one used after that. Seems confusing, and can be added later if a user requests it.

  1. Allow a file-descriptor for log output via an environment variable

fd's aren't portable, then we end up in windows land again.

  1. other ideas ?

I like exposing the the more raw logging callback, I just tried to keep it dead simple, but as @joholl mentions, puts us into grepping strings for level.

@lgtm-com
Copy link

lgtm-com bot commented Jun 8, 2021

This pull request introduces 1 alert when merging 8e8bd9c into 473f17d - view on LGTM.com

new alerts:

  • 1 for Implicit function declaration

@williamcroberts
Copy link
Member Author

@joholl I was referring to a custom fd > stderr, e.g. fd = createPipe(); setenv("TSS2_LOGFILE", "fd:%i" % fd);

Could you explain the weak symbol thing and how to use it here to me ? Is it something like a dlsym(dlopen(NULL), "mysymbol"); ? And is it C99 compliant ?

A weak symbol allows the linker to override it at link time. Any library following the declared weak symbol will override it. Weak symbols can be construed as a security issues, however if your linking it, you should ensure its a legit library. But beacuse of this I generally don't use weak symbols for that reason and portability.

Weak symbols are compiler attribute, they are not covered in C99. The only linking visibility covered in C, AFAIK, is static and extern storage. They also require support of the linker and file format. I think Mac's DYN supports them, but I don't think windows PE does; obviously ELF does.

@williamcroberts williamcroberts force-pushed the callback-rfc branch 2 times, most recently from a28d550 to f608711 Compare June 8, 2021 17:16
Just some code to guide a discussion on how to add a logging callback
routine to make applications and bindings uses the various apis easier
to integrate into different logging mechanisms. This would be nice for
the Python bindings.

Changed:
- Updated the API to call with various params over just a single string.
- Added SYS and ESYS routines.
- Updated Visual Studios build configs.

TODO:
1. Add FAPI support
2. Test
3. Fiddle with names, style and locations in the code base

Signed-off-by: William Roberts <[email protected]>
@AndreasFuchsTPM
Copy link
Member

@joholl I'd vote against weak symbols then, because we are C99 everywhere else and we want to stay compatible with all those old compilers for embedded.

@williamcroberts I'd favor the more complex callbacks as well

The thing about this option 2 is that if the tss is used from two different places (e.g. main application and a sub-library) they might be fighting about the loghandler setting. But I guess this case is negligible. So I like the current approach in principle.

The other thing is that we cannot set the loghandler for tcti modules at all using this approach if we go through the tctildr. I guess tctildr needs to forward it loghandler to the tcti modules upon loading them.

This might be a reason for Option 1 instead ? If the application already has libtss2-logging.so loaded, the loaded tcti modules would bind against that instead of loading a new one, right ?

Also Option 1 would allow a single setting of the logger instead of having to set it for each library; In case of pkcs11 that would mean setting the logger 4 times (for tcti(ldr), for mu, for esys and for fapi).

@joholl
Copy link
Collaborator

joholl commented Jun 9, 2021

I agree that weak symbols are not the ideal solution in terms of portability.

@williamcroberts

That [Option 1.] would then break it and the logic as we need a non-shared lib here IIUC.

Why do we need a non-shared lib, here? Similar to the mu lib, we could just link against libtss2-log.so, couldn't we? That would have the advantage of having a single logging handler and a single callback setter, as pointed out by @AndreasFuchsSIT.

@AndreasFuchsSIT With option 2.: since we also support linking against TCTI modules directly (as opposed to using TctiLdr), we'd also need a loghandler setter for the TCTI modules. As you said, the mu library (and maybe rc-decode?) needs logging, as well, and therefore their own loghandlers.

This might be a reason for Option 1 instead ? If the application already has libtss2-logging.so loaded, the loaded tcti modules would bind against that instead of loading a new one, right ?

No, I think if a TCTI module is dlopen-ed and depends on libtss2-logging.so, it loads its own instance of libtss2-logging.so, see the man pages:

If the library has dependencies on other shared libraries, then these are also automatically loaded by the dynamic linker using the same rules. (This process may occur recursively, if those libraries in turn have dependencies, and so on.)
...
External references in the library are resolved using the libraries in that library's dependency list and any other libraries previously opened with the RTLD_GLOBAL flag. If the executable was linked with the flag "-rdynamic" (or, synonymously, "--export-dynamic"), then the global symbols in the executable will also be used to resolve references in a dynamically loaded library.

But maybe if we dlopen libtss2-logging.so, it could work? I think this needs to be experimented with.

@williamcroberts
Copy link
Member Author

I agree that weak symbols are not the ideal solution in terms of portability.

@williamcroberts

That [Option 1.] would then break it and the logic as we need a non-shared lib here IIUC.

Why do we need a non-shared lib, here? Similar to the mu lib, we could just link against libtss2-log.so, couldn't we? That would have the advantage of having a single logging handler and a single callback setter, as pointed out by @AndreasFuchsSIT.

I see what he was saying now, but you would only have one logging call back for all libraries, or have to instrument that library for all libraries that want a dedicated logger.

@AndreasFuchsSIT With option 2.: since we also support linking against TCTI modules directly (as opposed to using TctiLdr), we'd also need a loghandler setter for the TCTI modules. As you said, the mu library (and maybe rc-decode?) needs logging, as well, and therefore their own loghandlers.

This might be a reason for Option 1 instead ? If the application already has libtss2-logging.so loaded, the loaded tcti modules would bind against that instead of loading a new one, right ?

Yes, the dynamic linker looks through the DT_NEEDED section (IIRC the section name correctly) and if not loaded, loads them.

No, I think if a TCTI module is dlopen-ed and depends on libtss2-logging.so, it loads its own instance of libtss2-logging.so, see the man pages:

If the library has dependencies on other shared libraries, then these are also automatically loaded by the dynamic linker using the same rules. (This process may occur recursively, if those libraries in turn have dependencies, and so on.)
...
External references in the library are resolved using the libraries in that library's dependency list and any other libraries previously opened with the RTLD_GLOBAL flag. If the executable was linked with the flag "-rdynamic" (or, synonymously, "--export-dynamic"), then the global symbols in the executable will also be used to resolve references in a dynamically loaded library.

But maybe if we dlopen libtss2-logging.so, it could work? I think this needs to be experimented with.

Dealing with needeing multiple instances comes back to the trick you used to deal with libtpms in #1992.

@williamcroberts
Copy link
Member Author

williamcroberts commented Jun 9, 2021

@williamcroberts I'd favor the more complex callbacks as well

The thing about this option 2 is that if the tss is used from two different places (e.g. main application and a sub-library) they might be fighting about the loghandler setting. But I guess this case is negligible. So I like the current approach in principle.

This has never been an issue with libselinux, so I doubt it will be one in practice. If someone does need to work around this they can use the dlopen trick in #1992

The other thing is that we cannot set the loghandler for tcti modules at all using this approach if we go through the tctildr. I guess tctildr needs to forward it loghandler to the tcti modules upon loading them.

Yes good point.

This might be a reason for Option 1 instead ? If the application already has libtss2-logging.so loaded, the loaded tcti modules would bind against that instead of loading a new one, right ?

I see what your saying now where we would just have one logger for all things, which is fine by me. I just assumed that we would want granularity here, but that's not a hard requirement and could be dropped. I think we would want to ensure that liblog is not considered a public library. disregard that statement.

Also Option 1 would allow a single setting of the logger instead of having to set it for each library; In case of pkcs11 that would mean setting the logger 4 times (for tcti(ldr), for mu, for esys and for fapi).

Yeah I'm starting to come around to option 1.

@williamcroberts
Copy link
Member Author

But maybe if we dlopen libtss2-logging.so, it could work? I think this needs to be experimented with.

Generally its not wise to dlopen things because then explicit dependencies cannot be discovered by interrogating the libraries dependencies in the DT_NEEDED section. IIRC, folks from Gentoo often gripe about this, something to do with their package manager. So I try to avoid dlopen unless a specific need arises.

@AndreasFuchsTPM
Copy link
Member

A new idea from Peter: We could have a new Initialize method that takes the logHandler as a parameter.
That should also prevent the problem with an application with 2 submodules (e.g. openssl engine and direct access) using the tss and each trying to claim the loghandler.

@williamcroberts
Copy link
Member Author

A new idea from Peter: We could have a new Initialize method that takes the logHandler as a parameter.
That should also prevent the problem with an application with 2 submodules (e.g. openssl engine and direct access) using the tss and each trying to claim the loghandler.

Yeah I thought about that as well, was unsure if we could tolerate a change like that. But it would be a separate API so all existing applications just behave the same and if anyone wants this behavior they can migrate.

@williamcroberts
Copy link
Member Author

A new idea from Peter: We could have a new Initialize method that takes the logHandler as a parameter.
That should also prevent the problem with an application with 2 submodules (e.g. openssl engine and direct access) using the tss and each trying to claim the loghandler.

This seems not to work because a lot of the API that logs doesn't have a state associated with it, like libmu.

@AndreasFuchsTPM
Copy link
Member

AndreasFuchsTPM commented Aug 6, 2021

A new idea from Peter: We could have a new Initialize method that takes the logHandler as a parameter.
That should also prevent the problem with an application with 2 submodules (e.g. openssl engine and direct access) using the tss and each trying to claim the loghandler.

This seems not to work because a lot of the API that logs doesn't have a state associated with it, like libmu.

Agreed.

per-API global setters are fine with me.

P.S. Who shall override whom ? Function overrides Env-Parsing or other the other way ?

@joholl
Copy link
Collaborator

joholl commented Aug 6, 2021

I'd opt for env overwrites function because it is far easier to just change the env. var. than to change the code and recompile when debugging. The env mechanism should only be used for debugging, anyway, imo.

@williamcroberts
Copy link
Member Author

The ENV var tweaks the level, and if the message is to be logged it goes to the callback registered by the application or to the default stderr stream.

@joholl
Copy link
Collaborator

joholl commented Aug 9, 2021

Sure. My expectation was that if we create a logging API with a setter for logging callbacks, we also add a call for setting the verbosity. I guess that was also what @AndreasFuchsSIT meant, right?

@williamcroberts
Copy link
Member Author

Sure. My expectation was that if we create a logging API with a setter for logging callbacks, we also add a call for setting the verbosity. I guess that was also what @AndreasFuchsSIT meant, right?

Ahh " we also add a call for setting the verbosity", ahh ok. With the env var do we really need a set log level function? When would we use that over the env var? If we had something like a --debug command line option we could just add to env.

gnomesysadmins pushed a commit to GNOME/gnome-remote-desktop that referenced this pull request Aug 5, 2022
There is no way yet[1] to control how much tss2 logs and to where, from the C
API. The problem with that is if there is no TPM 2.0 module, it'll very
verbosely tell about it via ERROR and WARNING log entries directly to
stderr, which isn't really what we want.

Change this by setting the env var controlling the log target to
/dev/null unless the tpm debug key is enabled.

[1] tpm2-software/tpm2-tss#2087
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants