Merge pull request #378 from Cray-HPE/develop
2.30.0 for CSM 1.6
mharding-hpe authored Oct 7, 2024
2 parents 23ad70c + 596f9bf commit d6b6a15
Showing 18 changed files with 885 additions and 496 deletions.
21 changes: 21 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,27 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

## [2.30.0] - 2024-10-07
### Added
#### BOS options
- `ims_errors_fatal`: This determines whether or not an IMS failure
  is considered fatal even when BOS could continue despite the failure. Specifically,
  this comes up when validating image architecture in a boot set. By default
  this is false. Note that this has no effect for boot sets that:
  - Have non-IMS images
  - Have IMS images but the image does not exist in IMS
  - Have `Other` architecture
- `ims_images_must_exist`: This determines whether or not BOS considers it a fatal error if
  a boot set has an IMS boot image which does not exist in IMS. If false (the default), then
  BOS will only log warnings about these. If true, then these will cause boot set validations
  to fail. Note that if `ims_images_must_exist` is true but `ims_errors_fatal` is false, then
  a failure to determine whether or not an image is in IMS will NOT result in a fatal error.

### Changed
- Refactored some BOS Options code to use abstract base classes, to avoid code duplication.
- Alphabetized options in API spec
- Refactored `controllers/v2/boot_sets.py` into its own module, for clarity

## [2.29.0] - 2024-10-01
### Added
- Run `pylint` during builds
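
The interaction between `ims_errors_fatal` and `ims_images_must_exist` described in the 2.30.0 entry can be sketched as a small decision function. This is only an illustration of the documented behavior, not the BOS implementation: the helper name `image_check_passes` and the three-state `image_found` argument are assumptions made for this example, and it folds the architecture check and the existence check into one function for brevity.

```python
import logging

LOGGER = logging.getLogger(__name__)


def image_check_passes(image_found, ims_errors_fatal=False,
                       ims_images_must_exist=False):
    """
    Hypothetical helper, not taken from the BOS source.

    image_found is True if IMS has the image, False if IMS reports that the
    image does not exist, and None if IMS could not be queried at all.
    Returns True if boot set validation may continue, False if it must fail.
    """
    if image_found is None:
        # IMS failure: fatal only when ims_errors_fatal is set, even if
        # ims_images_must_exist is true.
        if ims_errors_fatal:
            return False
        LOGGER.warning("Unable to query IMS; continuing validation")
        return True
    if not image_found:
        # The image is known to be missing from IMS.
        if ims_images_must_exist:
            return False
        LOGGER.warning("Boot image not found in IMS; continuing validation")
        return True
    return True


# IMS unreachable, images must exist, but IMS errors are not fatal
print(image_check_passes(None, ims_errors_fatal=False, ims_images_must_exist=True))
```

With both options left at their defaults (false), every branch above logs a warning at most and validation continues.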
48 changes: 33 additions & 15 deletions api/openapi.yaml.in
@@ -999,6 +999,12 @@ components:
minLength: 1
# This allows for over 10 years using the smallest units (minutes)
maxLength: 8
default_retry_policy:
type: integer
description: The default maximum number of attempts per node for failed actions.
example: 1
minimum: 0
maximum: 1048576
disable_components_on_completion:
type: boolean
description: |
@@ -1010,6 +1016,24 @@ components:
minimum: 0
# A little over a year
maximum: 33554432
ims_errors_fatal:
type: boolean
description: |
This option modifies how BOS behaves when validating the architecture of a boot image in a boot set.
Specifically, this option comes into play when BOS needs data from IMS in order to do this validation, but
IMS is unreachable.
In the above situation, if this option is true, then the validation will fail.
Otherwise, if the option is false, then a warning will be logged, but the validation will not
fail because of this.
ims_images_must_exist:
type: boolean
description: |
This option modifies how BOS behaves when validating a boot set whose boot image appears to be from IMS.
Specifically, this option comes into play when the image does not actually exist in IMS.
In the above situation, if this option is true, then the validation will fail.
Otherwise, if the option is false, then a warning will be logged, but the validation will not
fail because of this. Note that if ims_images_must_exist is true but ims_errors_fatal is false, then
a failure to determine whether or not an image is in IMS will NOT result in a fatal error.
logging_level:
type: string
description: The logging level for all BOS services
@@ -1020,36 +1044,30 @@ components:
minimum: 0
# Over 12 days
maximum: 1048576
max_power_on_wait_time:
max_component_batch_size:
type: integer
description: How long BOS will wait for a node to power on before calling power on again (in seconds)
description: The maximum number of Components that a BOS operator will process at once. 0 means no limit.
example: 1000
minimum: 0
# Over 12 days
maximum: 1048576
maximum: 131071
max_power_off_wait_time:
type: integer
description: How long BOS will wait for a node to power off before forcefully powering off (in seconds)
minimum: 0
# Over 12 days
maximum: 1048576
polling_frequency:
max_power_on_wait_time:
type: integer
description: How frequently the BOS operators check Component state for needed actions (in seconds)
description: How long BOS will wait for a node to power on before calling power on again (in seconds)
minimum: 0
# Over 12 days
maximum: 1048576
default_retry_policy:
polling_frequency:
type: integer
description: The default maximum number of attempts per node for failed actions.
example: 1
description: How frequently the BOS operators check Component state for needed actions (in seconds)
minimum: 0
# Over 12 days
maximum: 1048576
max_component_batch_size:
type: integer
description: The maximum number of Components that a BOS operator will process at once. 0 means no limit.
example: 1000
minimum: 0
maximum: 131071
reject_nids:
type: boolean
description: |
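
The comments attached to the `maximum` values in this spec ("Over 12 days", "A little over a year") can be checked with a few lines of arithmetic. The sketch below assumes both bounds are expressed in seconds; the spec states this for the power wait times, and it is an assumption here for the larger bound, whose option name is not visible in this hunk. The helper name `seconds_to_days` is made up for the example.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def seconds_to_days(seconds: int) -> float:
    """Convert a duration in seconds to days."""
    return seconds / SECONDS_PER_DAY


# Upper bounds copied from the spec, keyed by the comment attached to each
bounds = {
    "Over 12 days": 1048576,             # 2**20
    "A little over a year": 33554432,    # 2**25
}

for comment, maximum in bounds.items():
    print(f"{comment}: {maximum} s is about {seconds_to_days(maximum):.1f} days")
```

Both bounds are powers of two (2**20 and 2**25), which come to roughly 12.1 and 388.4 days. The `max_component_batch_size` bound of 131071 is a count rather than a duration, and equals 2**17 - 1.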
168 changes: 168 additions & 0 deletions src/bos/common/options.py
@@ -0,0 +1,168 @@
#
# MIT License
#
# (C) Copyright 2024 Hewlett Packard Enterprise Development LP
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included
# in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
# OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
# ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
# OTHER DEALINGS IN THE SOFTWARE.
#
from abc import ABC, abstractmethod
from typing import Any


# This is the source of truth for default option values. All other BOS
# code should either import this dict directly, or (preferably) access
# its values indirectly using a DefaultOptions object
DEFAULTS = {
    'cleanup_completed_session_ttl': "7d",
    'clear_stage': False,
    'component_actual_state_ttl': "4h",
    'default_retry_policy': 3,
    'disable_components_on_completion': True,
    'discovery_frequency': 300,
    'ims_errors_fatal': False,
    'ims_images_must_exist': False,
    'logging_level': 'INFO',
    'max_boot_wait_time': 1200,
    'max_component_batch_size': 2800,
    'max_power_off_wait_time': 300,
    'max_power_on_wait_time': 120,
    'polling_frequency': 15,
    'reject_nids': False,
    'session_limit_required': False
}

class BaseOptions(ABC):
    """
    Abstract base class for getting BOS option values
    """

    @abstractmethod
    def get_option(self, key: str) -> Any:
        """
        Return the value for the specified option
        """

    # These properties call the method responsible for getting the option value.
    # All these do is convert the response to the appropriate type for the option,
    # and return it.

    @property
    def cleanup_completed_session_ttl(self) -> str:
        return str(self.get_option('cleanup_completed_session_ttl'))

    @property
    def clear_stage(self) -> bool:
        return bool(self.get_option('clear_stage'))

    @property
    def component_actual_state_ttl(self) -> str:
        return str(self.get_option('component_actual_state_ttl'))

    @property
    def default_retry_policy(self) -> int:
        return int(self.get_option('default_retry_policy'))

    @property
    def disable_components_on_completion(self) -> bool:
        return bool(self.get_option('disable_components_on_completion'))

    @property
    def discovery_frequency(self) -> int:
        return int(self.get_option('discovery_frequency'))

    @property
    def ims_errors_fatal(self) -> bool:
        return bool(self.get_option('ims_errors_fatal'))

    @property
    def ims_images_must_exist(self) -> bool:
        return bool(self.get_option('ims_images_must_exist'))

    @property
    def logging_level(self) -> str:
        return str(self.get_option('logging_level'))

    @property
    def max_boot_wait_time(self) -> int:
        return int(self.get_option('max_boot_wait_time'))

    @property
    def max_component_batch_size(self) -> int:
        return int(self.get_option('max_component_batch_size'))

    @property
    def max_power_off_wait_time(self) -> int:
        return int(self.get_option('max_power_off_wait_time'))

    @property
    def max_power_on_wait_time(self) -> int:
        return int(self.get_option('max_power_on_wait_time'))

    @property
    def polling_frequency(self) -> int:
        return int(self.get_option('polling_frequency'))

    @property
    def reject_nids(self) -> bool:
        return bool(self.get_option('reject_nids'))

    @property
    def session_limit_required(self) -> bool:
        return bool(self.get_option('session_limit_required'))


class DefaultOptions(BaseOptions):
    """
    Returns the default value for each option
    """
    def get_option(self, key: str) -> Any:
        if key in DEFAULTS:
            return DEFAULTS[key]
        raise KeyError(key)


class OptionsCache(DefaultOptions, ABC):
    """
    Handler for reading configuration options from the BOS API/DB.
    This caches the options so that frequent use of these options does not
    always result in network/DB calls.
    """
    def __init__(self, update_on_create: bool = True):
        super().__init__()
        if update_on_create:
            self.update()
        else:
            self.options = {}

    def update(self) -> None:
        """Refreshes the cached options data"""
        self.options = self._get_options()

    @abstractmethod
    def _get_options(self) -> dict:
        """Retrieves the current options from the BOS API/DB"""

    def get_option(self, key: str) -> Any:
        if key in self.options:
            return self.options[key]
        try:
            return super().get_option(key)
        except KeyError as err:
            raise KeyError(f'Option {key} not found and no default exists') from err
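
To show how a concrete subclass of `OptionsCache` might be used, here is a minimal, self-contained sketch. It repeats a trimmed copy of the classes from this file (two options only) so that it runs on its own. `StaticOptions` and its `stored` argument are hypothetical and exist only for this example; the real subclasses are expected to fetch options from the BOS API or database.

```python
from abc import ABC, abstractmethod
from typing import Any

# Trimmed copy of DEFAULTS, for illustration only
DEFAULTS = {'ims_errors_fatal': False, 'polling_frequency': 15}


class BaseOptions(ABC):
    @abstractmethod
    def get_option(self, key: str) -> Any:
        """Return the value for the specified option"""

    @property
    def ims_errors_fatal(self) -> bool:
        return bool(self.get_option('ims_errors_fatal'))

    @property
    def polling_frequency(self) -> int:
        return int(self.get_option('polling_frequency'))


class DefaultOptions(BaseOptions):
    def get_option(self, key: str) -> Any:
        if key in DEFAULTS:
            return DEFAULTS[key]
        raise KeyError(key)


class OptionsCache(DefaultOptions, ABC):
    def __init__(self, update_on_create: bool = True):
        super().__init__()
        if update_on_create:
            self.update()
        else:
            self.options = {}

    def update(self) -> None:
        self.options = self._get_options()

    @abstractmethod
    def _get_options(self) -> dict:
        """Retrieve the current options"""

    def get_option(self, key: str) -> Any:
        if key in self.options:
            return self.options[key]
        try:
            return super().get_option(key)
        except KeyError as err:
            raise KeyError(f'Option {key} not found and no default exists') from err


class StaticOptions(OptionsCache):
    """Hypothetical subclass: serves options from a dict, not the BOS API/DB."""

    def __init__(self, stored: dict):
        self._stored = stored      # must be set before update() runs
        super().__init__()

    def _get_options(self) -> dict:
        return dict(self._stored)


options = StaticOptions({'polling_frequency': '30'})
print(options.polling_frequency)   # stored value, converted by int()
print(options.ims_errors_fatal)    # not stored, so the default is used
```

One consequence of the typed properties worth keeping in mind: the boolean ones convert with `bool()`, so a stored non-empty string such as `'false'` would evaluate as true. This only matters if the backing store can return strings for boolean options.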
30 changes: 14 additions & 16 deletions src/bos/operators/utils/boot_image_metadata/__init__.py
@@ -22,56 +22,54 @@
# OTHER DEALINGS IN THE SOFTWARE.
#

class BootImageMetaData:
    def __init__(self, boot_set):
        """
        Base class for BootImage Metadata object
        """
from abc import abstractmethod, ABC


class BootImageMetaData(ABC):
    """
    Base class for BootImage Metadata
    """

    def __init__(self, boot_set: dict):
        self._boot_set = boot_set
        self.artifact_summary = {}

    @property
    @abstractmethod
    def metadata(self):
        """
        Get the initial object metadata. This metadata may contain information
        about the other boot objects -- kernel, initrd, rootfs, kernel parameters.
        """
        return None

    @property
    @abstractmethod
    def kernel(self):
        """
        Get the kernel
        """
        return None

    @property
    @abstractmethod
    def initrd(self):
        """
        Get the initrd
        """
        return None

    @property
    @abstractmethod
    def boot_parameters(self):
        """
        Get the boot parameters
        """
        return None

    @property
    @abstractmethod
    def rootfs(self):
        """
        Get the rootfs
        """
        return None

    @property
    def arch(self):
        """
        Get the arch
        """
        return None

class BootImageMetaDataBadRead(Exception):
    """
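
The practical effect of turning `BootImageMetaData` into an abstract base class can be illustrated with a reduced example. The sketch below keeps only two of the five abstract properties. `FakeBootImageMetaData`, the boot set keys `kernel` and `initrd`, and the example path are assumptions made for illustration and are not part of BOS.

```python
from abc import ABC, abstractmethod


class BootImageMetaData(ABC):
    """Reduced to two of the five abstract properties for brevity."""

    def __init__(self, boot_set: dict):
        self._boot_set = boot_set
        self.artifact_summary = {}

    @property
    @abstractmethod
    def kernel(self):
        """Get the kernel"""

    @property
    @abstractmethod
    def initrd(self):
        """Get the initrd"""


class FakeBootImageMetaData(BootImageMetaData):
    """Hypothetical subclass that reads artifact paths straight from the boot set."""

    @property
    def kernel(self):
        return self._boot_set.get('kernel')

    @property
    def initrd(self):
        return self._boot_set.get('initrd')


# The base class can no longer be instantiated directly
try:
    BootImageMetaData({})
except TypeError as err:
    print(f"Base class cannot be instantiated: {err}")

# A subclass that implements every abstract property works as before
image = FakeBootImageMetaData({'kernel': 's3://boot-images/example/kernel'})
print(image.kernel)
```

Before this change, a subclass that forgot to override a property would silently return `None`; with `@abstractmethod`, the omission surfaces as a `TypeError` when the subclass is instantiated.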
2 changes: 1 addition & 1 deletion src/bos/operators/utils/boot_image_metadata/factory.py
@@ -38,7 +38,7 @@ class BootImageMetaDataFactory:
    Conditionally create new instances of the BootImageMetadata based on
    the type of the BootImageMetaData specified
    """
    def __init__(self, boot_set):
    def __init__(self, boot_set: dict):
        self.boot_set = boot_set

    def __call__(self):