Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reorder bashrc.d by type; skip automation if launched without bin; fix env and login #300

Merged

Conversation

clcollins
Copy link
Member

@clcollins clcollins commented Aug 8, 2024

This PR reorders and reorganizes the bashrc.d files, placing pure
aliases and functions low (0-9) and labeling them with "libs", exports
and sourcing followign that (10-19) and code execution and automations
after that (20-99). It also removes the shebang (#!) shell declaration
for sourced files, per convention, and sets shellcheck's shell
declaration to allow for continued linting.

This also skips automations (ocm and sre logins) if launching without
the ocm-container binary, to allow the SRE to decide what to do.

Additionally, this fixes sourcing the AWS cli completions, which are
currently broken in this image.

This also updates the way the OCM authentication is handled, moving
authentication to the ocm-container binary. OCM Container will
authenticate with OCM using browser-based auth outside the container if
the user is not logged in, storing the config in ~/.config/ocm/ocm.json
as ocm-cli would. If the user is already logged in and the tokens are
not expired, OCM Container will use that. The ocm.json file continues
to be mouted read-only, but the OCM_URL environment variable is set
inside the container to override the OCM URL config setting in the
config file, allowing for OCM Container to be logged into multiple
environments at once, and preventing it from overwriting environments
outside of the container/in other containers. Environment will change
based on the --ocm-url provided (with production being the default).

Finally, this removes to util scripts that have been broken and out of
date since backplane was introduced, and who's functions have largely
been replaced by OCM and Backplane commands, and fixes a single makefile
target that had incorrect syntax.

In practice this should have little impact directly on ocm-contianer
users, other than they will now be able to switch environments just by
setting the config value after login. Login still defaults to
production, and can be overridden by the binary with the OCMC_OCM_URL
env variable that is passed in if the user specifies the --ocm-url
flag.

Users using the ocm-contianer binary will stay logged into OCM if their
ocm.json file exists outside the container, so device auth is only
required as their token expires, and would be done manually by the SRE,
if that were to happen.

This is partially in support of
https://issues.redhat.com/browse/OSD-15847, to allow usage of the
ocm-container image elsewhere, and in preparation for changes to the
binary to make device auth login easier.

Signed-off-by: Chris Collins [email protected]

@openshift-ci openshift-ci bot requested review from rendhalver and T0MASD August 8, 2024 21:00
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 8, 2024
@clcollins clcollins force-pushed the OSD-15847_osdctl_ocm_container branch from d23313e to 03a6286 Compare August 8, 2024 21:57
@clcollins clcollins changed the title Reorder bashrc.d by type; skip automation if launched without bin Reorder bashrc.d by type; skip automation if launched without bin; fix env and login Aug 8, 2024
@@ -243,7 +243,7 @@ func parseRefToArgs(c ContainerRef) ([]string, error) {
if c.PublishAll {
args = append(args, "--publish-all")
} else if c.LocalPorts != nil {
for service, _ := range c.LocalPorts {
for service := range c.LocalPorts {
Copy link
Member Author

@clcollins clcollins Aug 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a linter change

pkg/ocm/ocm.go Outdated Show resolved Hide resolved
@@ -113,8 +113,7 @@ tag-n-push: registry-login tag push

# Golang-related
.PHONY: go_build
go_build:
mod fmt lint test build_snapshot
go_build: mod fmt lint test build_snapshot
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a syntax error, preventing make from parsing this target properly.

@@ -125,26 +125,13 @@ func New(cmd *cobra.Command, args []string) (*ocmContainer, error) {
maps.Copy(c.Envs, backplaneConfig.Env)
c.Volumes = append(c.Volumes, backplaneConfig.Mounts...)

// Copy the ocm config into the container
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved this into the ocmConfig stuff, to match how the rest of these Envs and Mounts are created, for consistency

@@ -0,0 +1,3 @@
#!/usr/bin/env bash

complete -C '/usr/local/aws-cli/aws_completer' aws
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixes the AWS bash completion

source ${HOME}/.bashrc.d/05-kube-ps1.sh

export PS1="[\W {\[\033[1;32m\]${OCMC_OCM_URL}\[\033[0m\]} \$(kube_ps1)]\$ "
export PS1="[\W {\[\033[1;32m\]\$(ocm config get url)\[\033[0m\]} \$(kube_ps1)]\$ "
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ensures the env is properly displayed each time the terminal line updates, so it's always accurate and doesn't require re-sourcing .bashrc

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we show short env in PS1[{prod}] ? my only concern is that it will show lengthy env like [{https://api.openshift.com}]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could always overwrite individually in the personalizations mount - IIRC those are automatically sourced at the end of the source change so you could do something like:

ocm_short_url() {
  env_url=$(ocm config get url)
  if grep staging <<< $env_url; then
    echo "staging"
    return
  elif grep integration <<< $env_url; then
    echo "integration"
    return
  fi
  echo "production"
 }
 
 export PS1="[\W {\[\033[1;32m\]\$(ocm_short_urll)\[\033[0m\]} \$(kube_ps1)]\$"

Written entirely from memory but the idea is that would allow you to customize your own PS1 to show whatever you wanted it to.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we get the short url from ocm somehow so we don't need ocm-container specific code to do it?
Or push our changes upstream?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, this had to be moved back to ${OCM_URL}, as the ocm config get url no longer necessarily gets the correct URL, as we're overriding the ocm config URL with the env var.

This will still be a long-ish URL though, but also no different from what is currently displayed in OCM Container, I believe.

I'm happy for any other commits to make this cleaner in the future though.

Copy link
Contributor

@iamkirkbater iamkirkbater left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some minor thoughts about the drawbacks of making the ocm config readwrite and keeping a global state.

@@ -37,9 +37,9 @@ const (
const (
sshDeprecationMsg = "SSH multiplexing and Socket mounting is no longer needed or supported. Please remove the 'DISABLE_SSH_MULTIPLEXING' and 'SSH_AUTH_SOCK' fields from your configuration."
backplaneConfigDirDeprecationMsg = "The 'BACKPLANE_CONFIG_DIR' field is deprecated and will be removed in a future version. Please remove it from your configuration. You may specify an alternate backplane config file with 'BACKPLANE_CONFIG'."
ocmUrlDeprecationMsg = "The 'OCM_URL' field is deprecated and will be removed in a future version. Please remove it from your configuration."
ocmUrlDeprecationMsg = "The 'OCM_URL' field is deprecated and no longer used. Please remove it from your configuration."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to double check what the functionality change here is -

If an SRE is currently using an alias like alias ocm-container-stg='OCM_URL=staging ocm-container' they would need to update their alias to alias ocm-container-stg='OCMC_OCM_URL=staging ocm-container'? Do we need to update the docs with this change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OCM-Container doesn't currently do anything if you set OCM_URL=staging ocm-container. The environment variable was already deprecated and no longer the way to switch between environments, back when the binary changed to Go. All the env vars require (and have required) the OCMC_ prefix.

This change is fully documenting that it's no longer used, and no longer referenced anywhere.

Copy link
Contributor

@rendhalver rendhalver Aug 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have mine set like this:
I think that would still work after this change, I will check when I get a chance.

alias ocm-container-int='ocm-container --ocm-url=integration'
alias ocm-container-stg='ocm-container --ocm-url=staging'

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^ Yeah, that is the current way to do it, since the ocm-container binary rewrite, and should continue to work with this PR.

utils/bashrc.d/20-ocm-login.bashrc Outdated Show resolved Hide resolved
@clcollins
Copy link
Member Author

/hold

For some further discussion

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 8, 2024
@clcollins clcollins force-pushed the OSD-15847_osdctl_ocm_container branch from 8f422c5 to c74e5e7 Compare August 13, 2024 17:11
This PR reorders and reorganizes the bashrc.d files, placing pure
aliases and functions low (0-9) and labeling them with "libs", exports
and sourcing followign that (10-19) and code execution and automations
after that (20-99). It also removes the shebang (#!) shell declaration
for sourced files, per convention, and sets shellcheck's shell
declaration to allow for continued linting.

This also skips automations (ocm and sre logins) if launching without
the ocm-container binary, to allow the SRE to decide what to do.

Additionally, this fixes sourcing the AWS cli completions, which are
currently broken in this image.

This also updates the way the OCM authentication is handled, moving
authentication to the ocm-container binary. OCM Container will
authenticate with OCM using browser-based auth outside the container if
the user is not logged in, storing the config in ~/.config/ocm/ocm.json
as ocm-cli would.  If the user is already logged in and the tokens are
not expired, OCM Container will use that.  The ocm.json file continues
to be mouted read-only, but the OCM_URL environment variable is set
inside the container to override the OCM URL config setting in the
config file, allowing for OCM Container to be logged into multiple
environments at once, and preventing it from overwriting environments
outside of the container/in other containers. Environment will change
based on the --ocm-url provided (with production being the default).

Finally, this removes to util scripts that have been broken and out of
date since backplane was introduced, and who's functions have largely
been replaced by OCM and Backplane commands, and fixes a single makefile
target that had incorrect syntax.

In practice this should have little impact directly on ocm-contianer
users, other than they will now be able to switch environments just by
setting the config value after login. Login still defaults to
production, and can be overridden by the binary with the OCMC_OCM_URL
env variable that is passed in if the user specifies the `--ocm-url`
flag.

Users using the ocm-contianer binary will stay logged into OCM if their
ocm.json file exists outside the container, so device auth is only
required as their token expires, and would be done manually by the SRE,
if that were to happen.

This is partially in support of
https://issues.redhat.com/browse/OSD-15847, to allow usage of the
ocm-container image elsewhere, and in preparation for changes to the
binary to make device auth login easier.

Signed-off-by: Chris Collins <[email protected]>
@clcollins clcollins force-pushed the OSD-15847_osdctl_ocm_container branch from cf90f11 to 2ec293e Compare August 13, 2024 21:53
@clcollins
Copy link
Member Author

clcollins commented Aug 13, 2024

/unhold

@iamkirkbater @rendhalver @samanthajayasinghe - I think I've fixed, or explained, all the issues here. This is ready for a re-review, if you have some time.

I've updated the PR description with how things exist in the latest commit, not the original commit. Let me know if you have any concerns.

It's worth noting that we should cut a new release after this merges, I think, and encourage everyone to use the latest build image, as the internals of the image changed as well as the external binary.

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 13, 2024
@rendhalver
Copy link
Contributor

@iamkirkbater @rendhalver @samanthajayasinghe - I think I've fixed, or explained, all the issues here. This is ready for a re-review, if you have some time.

Having a look now. )

It's worth noting that we should cut a new release after this merges, I think, and encourage everyone to use the latest build image, as the internals of the image changed as well as the external binary.

Yep I totally agree with cutting a new release after we merge.

@rendhalver
Copy link
Contributor

rendhalver commented Aug 14, 2024

Testing notes::
I built the image and binary locally and it looks to be all working if I pass in a cluster-id.

$ ./build/ocm-container -t latest --registry "" --repository localhost --cluster-id c03103eb-1571-498d-b1fd-70587b445faa
Using config file: /home/pebrown/.config/ocm-container/ocm-container.yaml
[snip]
Checking the context on 15uq6splkva07rae2hgih6eph4vs3p8m
Encountered Errors during data collection. Displayed data may be incomplete: 
	cluster is not a HCP/Management Cluster
=================================================
osd-v4stg-aws -- 15uq6splkva07rae2hgih6eph4vs3p8m
=================================================
[snip]
[~ {https://api.openshift.com} (osd-v4stg-aws.z9a9.p1:default)]$ echo $CLUSTER_ID
15uq6splkva07rae2hgih6eph4vs3p8m
[~ {https://api.openshift.com} (osd-v4stg-aws.z9a9.p1:default)]$ echo $CLUSTER_UUID
c03103eb-1571-498d-b1fd-70587b445faa
[~ {https://api.openshift.com} (osd-v4stg-aws.z9a9.p1:default)]$ echo $CLUSTER_NAME
osd-v4stg-aws

I did notice that if we use the cluster name for osd-v4stg-aws it doesn't do all of that but that's because we are searching on the display name in utils/bashrc.d/06-sre-login-libs.bashrc and osd-v4stg-aws has it set to this:
Display Name: SRE long lived cluster in production: osd-v4stg-aws
So it's a bit of a unique case.

$ ./build/ocm-container -t latest --registry "" --repository localhost --cluster-id osd-v4stg-aws
Using config file: /home/pebrown/.config/ocm-container/ocm-container.yaml
Error: Can't save config file: can't write file '/root/.config/ocm/ocm.json': open /root/.config/ocm/ocm.json: read-only file system
[~ {https://api.openshift.com} (osd-v4stg-aws.z9a9.p1:default)]$ echo $CLUSTER_ID

[~ {https://api.openshift.com} (osd-v4stg-aws.z9a9.p1:default)]$ echo $CLUSTER_UUID

[~ {https://api.openshift.com} (osd-v4stg-aws.z9a9.p1:default)]$ echo $CLUSTER_NAME


[~ {https://api.openshift.com} (osd-v4stg-aws.z9a9.p1:default)]$ 

I am not sure how to fix the prompt if we aren't using an ENV var set to the short url.
Kirk's idea of setting the short-url before we set the prompt is a nice solution.

@rendhalver
Copy link
Contributor

I am going to put a hold on this and lgtm it.
Feel free to unhold other people are happy with it.

/hold
/lgtm

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 14, 2024
@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 14, 2024
Copy link
Contributor

openshift-ci bot commented Aug 14, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: clcollins, rendhalver

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [clcollins,rendhalver]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@clcollins
Copy link
Member Author

I did notice that if we use the cluster name for osd-v4stg-aws it doesn't do all of that but that's because we are searching on the display name in utils/bashrc.d/06-sre-login-libs.bashrc and osd-v4stg-aws has it set to this:
Display Name: SRE long lived cluster in production: osd-v4stg-aws

Ah, good catch @rendhalver ! We're using display_name in that query, but most clusters have a longer description in that field. It looks like ocm-cli actually uses name when it does the same lookup:

https://github.com/openshift-online/ocm-cli/blob/main/pkg/cluster/cluster.go#L247

Using "name" instead returns the info for osd-v4stg-aws as expected:

ocm list clusters '--parameter=search=((id like '\''osd-v4stg-aws'\'' or external_id like '\''osd-v4stg-aws'\'' or name like '\''%osd-v4stg-aws%'\''))' --columns 'id, external_id, name' --no-headers
REDACTED  REDACTED  osd-v4stg-aws 

That has probably always been broken. I'm happy to put in a bug fix in a follow up PR, if that's cool with you.

clcollins added a commit to clcollins/ocm-container that referenced this pull request Aug 14, 2024
Some clusters use "display_name" as a longer, more descriptive field.
And example of this is `osd-v4stg-aws`, which has:

```
Display Name:	SRE long lived cluster in production: osd-v4stg-aws
```

This causes osd-v4stg-aws (or others) to not be found when searching by the human
friendly name, as identified by @rendhalver in a [recent pr
review](openshift#300 (comment)).

This PR adds "name" to the OCM cluster search parameters, which is the
parameter used by ocm-cli when it performs the same search.

Signed-off-by: Chris Collins <[email protected]>
@clcollins
Copy link
Member Author

Actually, @rendhalver, I just went ahead and fixed that: #301

@rendhalver
Copy link
Contributor

That has probably always been broken. I'm happy to put in a bug fix in a follow up PR, if that's cool with you.

Oh I am totally happy getting that sorted after this one is merged. :)

@rendhalver
Copy link
Contributor

/unhold

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 14, 2024
@openshift-merge-bot openshift-merge-bot bot merged commit 2d9e91d into openshift:master Aug 14, 2024
2 checks passed
clcollins added a commit to clcollins/ocm-container that referenced this pull request Aug 14, 2024
Some clusters use "display_name" as a longer, more descriptive field.
And example of this is `osd-v4stg-aws`, which has:

```
Display Name:	SRE long lived cluster in production: osd-v4stg-aws
```

This causes osd-v4stg-aws (or others) to not be found when searching by the human
friendly name, as identified by @rendhalver in a [recent pr
review](openshift#300 (comment)).

This PR adds "name" to the OCM cluster search parameters, which is the
parameter used by ocm-cli when it performs the same search.

Signed-off-by: Chris Collins <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants