Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

py client: drop delphi_utils requirement, log to stderr instead #1486

Merged
merged 2 commits into from
Jul 9, 2024

Conversation

melange396
Copy link
Collaborator

This drops the requirement for the bloated delphi_utils package in the python client, and replaces the delphi_utils.logger with printing to STDERR instead.

Addresses #1467

I havent tested this yet, and i imagine it is going to fail CI.

def log(evt, **kwargs):
kwargs['event'] = evt
kwargs['timestamp'] = time.strftime("%Y-%m-%d %H:%M:%S %z")
return sys.stderr.write(str(kwargs) + "\n")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this might need a flush() call to keep it synced

Suggested change
return sys.stderr.write(str(kwargs) + "\n")
sys.stderr.write(str(kwargs) + "\n")
sys.stderr.flush()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TL;DR: we don't need the flush().

>>> sys.stderr
<_io.TextIOWrapper name='<stderr>' mode='w' encoding='utf-8'>
>>> sys.stderr.line_buffering
True

io.TextIOWrapper says this about line_buffering=True

If line_buffering is True, flush() is implied when a call to write contains a newline character or a carriage return.

Copy link

@dshemetov
Copy link
Contributor

Seems like a simple enough workaround 👍 .


I was curious to learn a little about stderr and structlog. A few basic notes I found helpful, for future reference:

  • get_structured_logger() is set to always use logger.StreamHandler() (I'm actually not sure if the branch with logging.FileHandler() is ever hit, since I think system_logger.handlers always has the StreamHandler() as per the config above)
  • logger.StreamHandler() defaults to stderr as its output
  • so I think the end result is that structlog does our formatting and string rendering and the standard logging library then forwards that string to stderr (structlog has some suggested configurations and ours seems closest to this one)

@melange396 melange396 requested review from rzats and dshemetov June 29, 2024 04:06
@melange396
Copy link
Collaborator Author

@dshemetov youre right, the FileHandler stuff is basically unreachable. One of the googlebros borked it, but it wasnt much better before that anyway... I guess we are lucky that we never needed to directly write to a log file with this!

@dshemetov
Copy link
Contributor

dshemetov commented Jun 29, 2024

Ah well this actually answers a question that came up in a conversation with @aysim319: I was wondering whether any of these get_structured_logger calls actually depended on that filename argument in any important way and looks like the answer is no. Since it's been like this for 3 years, we must not rely on those file logs. On the plus side, that suggests that we can remove that argument, so the logger doesn't depend on the params file, so it can be pulled out of the run function, set as a global in each module, and then we can stop passing the logger around from function to function.

Update: alright, so the good news is that the FileHandler branch is actually reached (which allows indicator-specific log files). The way this works out is that as long as the name variable is non-empty in logging.getLogger(), the returned logger is a child logger which don't have handlers by default:

import logging

logging.basicConfig(
    format="%(message)s",
    level=logging.DEBUG,
    handlers=[logging.StreamHandler()] # This is the default handler
)
# Root logger gets the handler
root_logger = logging.getLogger()
assert root_logger.handlers
# Named loggers have no handlers by default
named_logger = logging.getLogger("named")
assert not named_logger.handlers

@dshemetov dshemetov mentioned this pull request Jul 1, 2024
4 tasks
Copy link
Contributor

@dshemetov dshemetov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests look good, new log() function with str({...}) looks good. Low risk, non-user facing change, removes lots of dependencies.

def log(evt, **kwargs):
kwargs['event'] = evt
kwargs['timestamp'] = time.strftime("%Y-%m-%d %H:%M:%S %z")
return sys.stderr.write(str(kwargs) + "\n")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TL;DR: we don't need the flush().

>>> sys.stderr
<_io.TextIOWrapper name='<stderr>' mode='w' encoding='utf-8'>
>>> sys.stderr.line_buffering
True

io.TextIOWrapper says this about line_buffering=True

If line_buffering is True, flush() is implied when a call to write contains a newline character or a carriage return.

@melange396 melange396 merged commit 6a3f683 into dev Jul 9, 2024
7 checks passed
@melange396 melange396 deleted the py_client_log_to_stderr branch July 9, 2024 15:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants