[native] Add LinuxMemoryChecker check/warning to ensure system-mem-limit-gb is reasonably set #24149

Open
wants to merge 1 commit into
base: master

minhancao
Contributor

@minhancao minhancao commented Nov 26, 2024

Description

Add a check and a warning to LinuxMemoryChecker to ensure that system-memory-gb <= system-mem-limit-gb < actual total memory capacity.

For cgroup v1:
Set the actual total memory to the smaller of MemTotal in /proc/meminfo and the value in memory.limit_in_bytes.

For cgroup v2:
Set the actual total memory to the smaller of MemTotal in /proc/meminfo and the value in memory.max.
If memory.max contains the string "max" (no limit is set), use MemTotal from /proc/meminfo; otherwise compare against the numeric value in memory.max.
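
The selection rule above can be sketched roughly as follows. This is a minimal illustration that works on file contents passed in as strings, not the code in this PR; the helper names parseMemTotalBytes, parseCgroupLimitBytes and actualTotalMemoryBytes are hypothetical. It assumes MemTotal is reported in kB and that an unlimited cgroup v1 shows up as a very large number that simply loses the comparison.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: extracts MemTotal (reported in kB) from the text of
// /proc/meminfo and returns it in bytes, or nullopt if it cannot be found.
std::optional<int64_t> parseMemTotalBytes(const std::string& meminfo) {
  const std::string key = "MemTotal:";
  std::istringstream in(meminfo);
  std::string line;
  while (std::getline(in, line)) {
    if (line.rfind(key, 0) == 0) { // line starts with "MemTotal:"
      std::istringstream fields(line.substr(key.size()));
      int64_t kb = 0;
      if (fields >> kb) {
        return kb * 1024;
      }
      return std::nullopt;
    }
  }
  return std::nullopt;
}

// Hypothetical helper: parses the content of memory.limit_in_bytes (v1) or
// memory.max (v2). Returns nullopt for "max", empty, or non-numeric content.
std::optional<int64_t> parseCgroupLimitBytes(const std::string& content) {
  std::istringstream in(content);
  std::string token;
  if (!(in >> token) || token == "max") {
    return std::nullopt;
  }
  try {
    std::size_t pos = 0;
    const long long value = std::stoll(token, &pos);
    if (pos != token.size()) {
      return std::nullopt;
    }
    return static_cast<int64_t>(value);
  } catch (const std::exception&) {
    return std::nullopt;
  }
}

// Smaller of MemTotal and the cgroup limit; falls back to whichever one is
// available when the other is missing. Returns 0 if neither can be read.
int64_t actualTotalMemoryBytes(
    const std::string& meminfo,
    const std::string& cgroupContent) {
  const auto memTotal = parseMemTotalBytes(meminfo);
  const auto limit = parseCgroupLimitBytes(cgroupContent);
  if (memTotal && limit) {
    return std::min(*memTotal, *limit);
  }
  return memTotal ? *memTotal : limit.value_or(0);
}
```

Taking the contents as strings keeps the rule easy to exercise without touching the real /proc or /sys files.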

VELOX_CHECK_LT(system-mem-limit-gb, actual total memory capacity):

system-mem-limit-gb is higher than the actual total memory capacity. Expected: system-mem-limit-gb < actual total memory capacity.

Warning written to the worker's log:

system-mem-limit-gb is smaller than system-memory-gb. Expected: system-mem-limit-gb >= system-memory-gb.
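
As a rough sketch of how the check and the warning relate, the function below classifies a configuration instead of throwing or logging. The names LimitStatus and classifySystemMemLimit are hypothetical, the conversion of 1 GB to 2^30 bytes is an assumption, and where the PR uses VELOX_CHECK_LT this sketch only returns a status so the caller can decide what to do.

```cpp
#include <cstdint>

// Hypothetical result type for this sketch.
enum class LimitStatus {
  kOk,                // system-memory-gb <= system-mem-limit-gb < capacity
  kBelowSystemMemory, // would trigger the warning in the worker's log
  kNotBelowCapacity   // would fail the VELOX_CHECK_LT described above
};

// Classifies the configured limits against the detected capacity.
// Assumption: the *-gb properties are converted with 1 GB = 2^30 bytes.
LimitStatus classifySystemMemLimit(
    int64_t systemMemoryGb,
    int64_t systemMemLimitGb,
    int64_t actualTotalMemoryBytes) {
  constexpr int64_t kBytesPerGb = int64_t{1} << 30;
  const int64_t limitBytes = systemMemLimitGb * kBytesPerGb;
  // Strict comparison, matching the "<" stated in the description.
  if (limitBytes >= actualTotalMemoryBytes) {
    return LimitStatus::kNotBelowCapacity;
  }
  if (systemMemLimitGb < systemMemoryGb) {
    return LimitStatus::kBelowSystemMemory;
  }
  return LimitStatus::kOk;
}
```

With 64 GB of detected capacity, for example, a limit of 50 and system memory of 40 is accepted, a limit of 64 is rejected by the strict comparison, and a limit of 30 only produces the warning case.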

Motivation and Context

Impact

Test Plan

Contributor checklist

  • Please make sure your submission complies with our development, formatting, commit message, and attribution guidelines.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== NO RELEASE NOTE ==

@minhancao minhancao self-assigned this Nov 26, 2024
@prestodb-ci prestodb-ci added the from:IBM PR from IBM label Nov 26, 2024
@prestodb-ci prestodb-ci requested review from a team, psnv03 and pramodsatya and removed request for a team November 26, 2024 02:04
@minhancao minhancao marked this pull request as ready for review November 26, 2024 02:07
@minhancao minhancao requested a review from a team as a code owner November 26, 2024 02:07
@minhancao minhancao changed the title [native] Add LinuxMemoryChecker warnings to ensure system-memory-gb < system-mem-limit-gb < actual total memory capacity [native] Add LinuxMemoryChecker warnings to ensure system-mem-limit-gb is reasonably set Nov 26, 2024
@minhancao minhancao force-pushed the linuxmemorychecker_mem_limit_check branch from 4478ae1 to 15f55bb Compare November 26, 2024 02:29
@minhancao minhancao changed the title [native] Add LinuxMemoryChecker warnings to ensure system-mem-limit-gb is reasonably set [native] Add LinuxMemoryChecker check/warning to ensure system-mem-limit-gb is reasonably set Nov 26, 2024
@minhancao minhancao force-pushed the linuxmemorychecker_mem_limit_check branch from 15f55bb to 7646600 Compare November 26, 2024 06:06
…ystem-mem-limit-gb < actual total memory capacity.
Contributor

@czentgr czentgr left a comment

Can we add a test with fake files again, just like we did in the original tests for this class?
That way we can try the "max" value for cgroup v2, as well as gigantic values and reasonable values. Basically, we would be testing the various situations we saw when investigating this.
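
One possible shape for such a fake-file test is sketched below. It is not the existing test fixture for this class; writeFakeFile and readLimitFile are hypothetical helpers, and the reader is a stand-in for whatever the checker does with the file named by memMaxFile.

```cpp
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical helper: writes 'content' to dir/name and returns the path.
std::filesystem::path writeFakeFile(
    const std::filesystem::path& dir,
    const std::string& name,
    const std::string& content) {
  std::filesystem::create_directories(dir);
  const auto path = dir / name;
  std::ofstream out(path, std::ios::trunc);
  out << content;
  return path; // 'out' is flushed and closed when it goes out of scope
}

// Hypothetical stand-in reader: returns the numeric limit in the file, or
// nullopt if the file is missing, empty, contains "max", or is not a number.
std::optional<int64_t> readLimitFile(const std::filesystem::path& path) {
  std::ifstream in(path);
  std::string token;
  if (!(in >> token) || token == "max") {
    return std::nullopt;
  }
  try {
    std::size_t pos = 0;
    const long long value = std::stoll(token, &pos);
    if (pos != token.size()) {
      return std::nullopt;
    }
    return static_cast<int64_t>(value);
  } catch (const std::exception&) {
    return std::nullopt;
  }
}
```

A test could then write "max", a gigantic value, and a reasonable value into a temporary directory and assert on what the reader returns for each.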

void start() {
  // Set memMaxFile to "/sys/fs/cgroup/memory/memory.limit_in_bytes" for
  // cgroup v1 or "/sys/fs/cgroup/memory.max" for cgroup v2.
  struct stat buffer;
Contributor

Do we need to re-declare this as a struct here?

Contributor Author

I tried declaring it as "stat buffer;" and got this error:

/root/presto/presto-native-execution/./presto_cpp/main/LinuxMemoryChecker.cpp:30:9: error: expected ‘;’ before ‘buffer’
   30 |     stat buffer;
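
For context, this error is expected in C++: <sys/stat.h> declares both a struct named stat and a function named stat, and the function name hides the struct name, so the type has to be written as "struct stat". A minimal sketch, using a hypothetical fileExists helper that is not part of this PR:

```cpp
#include <sys/stat.h>

#include <string>

// Hypothetical helper: returns true if 'path' exists.
bool fileExists(const std::string& path) {
  // "stat buffer;" would not compile here because the name 'stat' resolves
  // to the function; the elaborated form 'struct stat' selects the type.
  struct stat buffer;
  return ::stat(path.c_str(), &buffer) == 0;
}
```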

int64_t actualTotalMemory = 0;
folly::gen::byLine("/proc/meminfo") |
    [&](const folly::StringPiece& line) -> void {
      if (actualTotalMemory == 0) {
Contributor

We don't need this check because we return immediately if it is set on line 79.

};
}

VELOX_CHECK_LT(
Contributor

We should allow less than or equal.

return;
}
};
if (memMaxFile != "None") {
Contributor

We might need to add a comment here describing what the contents of memMaxFile could be, since we also check for the "max" keyword.

Labels
from:IBM PR from IBM

3 participants