Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix module test failure on Debian 12 #818

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

jepio
Copy link
Member

@jepio jepio commented Nov 21, 2024

Description

Two issues:

apt-cache

On a freshly configured system (VM/container/server) the apt cache may not be populated initially. This causes any remediation task that requires package installation to happen to fail. Ensure we update the package cache on the first invocation of a function that attempts to install a package.

With this change apt-get update is done once and only on demand. Some remediation functions need to be revised to better interact with this, for example RemediateEnsureAuditdServiceIsRunning. RemediateEnsureAuditdServiceIsRunning attempts to install 4 different package candidates, some of which may not be available in a distro version at all. Each such probe could trigger an apt-get update, even if a later package in the list is already installed. This scenario is not critical because:

  • for MC there is a Test() call that would never invoke apt-get upate and that would suppress the call to Set()
  • for standalone Osconfig we would have one apt-get update per lifetime of osconfig-platform daemon which is acceptable.

cronie

The order of remediate tasks differs between moduletest and MC policy. The remediation tasks that modify permissions of cron files in /etc/ do nothing until these files are created by installing crond. Subsequent audit tasks related to these cron files can fail. Move the remediateEnsureCronServiceIsEnabled task in moduletest earlier, so that the files are created before the permission remediation tasks run.
dropped, because it's superseded by #821 which has the correct ordering.

Checklist

  • I have read the contribution guidelines.
  • I added unit-tests to validate my changes. All unit tests are passing.
  • I have merged the latest main branch prior to this PR submission.
  • I submitted this PR against the main branch.

@jepio jepio requested a review from a team as a code owner November 21, 2024 16:38
@jepio
Copy link
Member Author

jepio commented Nov 21, 2024

Two issues:

Happy to split the PR and or delay this (since it only affects remediation), but combined here for narrowing down moduletest failures.

OsConfigLogError(log, "AptGetUpdateOnce: apt-get update failed with %d", status);
}

g_aptGetUpdateExecuted = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we set to true even though it fails on L198 so it can try again on a subsequent run?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, let's try that

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor

@AhmedBM AhmedBM left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great find! Looks good but just curious about the value of having the AptGetUpdateOnce not returning true on failure.

OsConfigLogError(log, "AptGetUpdateOnce: apt-get update failed with %d", status);
}

g_aptGetUpdateExecuted = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

g_aptGetUpdateExecuted = true;

Should this be moved after line 192, so when 'apt-get update' fails, it could be retried at the next call?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, will do

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}

status = ExecuteCommand(NULL, command, false, false, 0, 0, NULL, NULL, log);
if (0 == status)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: combine in one line to match existing code?

Like this:

if (0 == (status = ExecuteCommand(NULL, command, false, false, 0, 0, NULL, NULL, log)))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will do

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@MariusNi
Copy link
Contributor

Yes, second that, this is an awesome find. Also same question (do we need to retry if the command fails?)


In reply to: 2452224606

@MariusNi
Copy link
Contributor

MariusNi commented Nov 21, 2024

No need to split in my opinion. If you can add the fix for the ModuleTest (the SecurityBaseline module test recipe) in this PR, it would be great. Make sure that you will fix the ordering there of the MIM objects under test to match the policy MOF for all sequences, Get/Audit, Init, and Set/Remediate. The test recipe is at https://github.com/Azure/azure-osconfig/blob/main/src/modules/test/recipes/SecurityBaselineTests.json. Please validate this locally before submitting to PR, with: sudo modules/test/moduletest ../src/modules/test/recipes/SecurityBaselineTests.json


In reply to: 2491873158

@MariusNi
Copy link
Contributor

Another thing to consider, we cannot rely on the exact order in which the MC calls will arrive. So we may need also a change in Asb.c where in case one check is executed before the other, the necessary update to happen (with a new global there like here).


In reply to: 2492095717

…installation

On a freshly configured system (VM/container/server) the apt cache may not be
populated.  This causes any remediation task that requires package installation
to happen to fail. Ensure we update the package cache on the first invocation
of a function that attempts to install a package.
@jepio
Copy link
Member Author

jepio commented Nov 25, 2024

No need to split in my opinion. If you can add the fix for the ModuleTest (the SecurityBaseline module test recipe) in this PR, it would be great. Make sure that you will fix the ordering there of the MIM objects under test to match the policy MOF for all sequences, Get/Audit, Init, and Set/Remediate. The test recipe is at https://github.com/Azure/azure-osconfig/blob/main/src/modules/test/recipes/SecurityBaselineTests.json. Please validate this locally before submitting to PR, with: sudo modules/test/moduletest ../src/modules/test/recipes/SecurityBaselineTests.json

I've pushed this part out to a separate PR because of the size of the changes: #821.

Another thing to consider, we cannot rely on the exact order in which the MC calls will arrive. So we may need also a change in Asb.c where in case one check is executed before the other, the necessary update to happen (with a new global there like here).

Could you explain what you mean by "we cannot rely on the exact order in which the MC calls will arrive"? I thought the order of operations is defined by the MOF file.

@jepio jepio mentioned this pull request Nov 26, 2024
7 tasks
@jepio
Copy link
Member Author

jepio commented Nov 26, 2024

A different fix is necessary to fix failures for RHEL9: #822

@AhmedBM
Copy link
Contributor

AhmedBM commented Nov 27, 2024

Should we abandon this PR given its covered in #823?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants