
[bitnami/elasticsearch] GKE Ingress backend unhealthy. #16599

Open
mmosierteamvelocity opened this issue May 11, 2023 · 9 comments
Labels: elasticsearch, on-hold, tech-issues
mmosierteamvelocity commented May 11, 2023

Name and Version

bitnami/elasticsearch

What architecture are you using?

amd64

What steps will reproduce the bug?

1. Deploy the Helm chart to GKE using gcloud, with readiness and liveness probes enabled, basic auth enabled, and ingress enabled.

Everything comes up fine: I can reach Elasticsearch via the endpoints, but not through the Ingress. My Ingress configuration is as follows:

ingress:
  ## @param ingress.enabled Enable ingress record generation for Elasticsearch
  ##
  enabled: true
  ## @param ingress.pathType Ingress path type
  ##
  pathType: ImplementationSpecific
  ## @param ingress.apiVersion Force Ingress API version (automatically detected if not set)
  ##
  apiVersion: networking.k8s.io/v1
  ## @param ingress.hostname Default host for the ingress record
  ##
  hostname: blah.blah.com
  ## @param ingress.path Default path for the ingress record
  ## NOTE: You may need to set this to '/*' in order to use this with ALB ingress controllers
  ##
  path: /
  ## @param ingress.annotations Additional annotations for the Ingress resource. To enable certificate autogeneration, place here your cert-manager annotations.
  ## Use this parameter to set the required annotations for cert-manager, see
  ## ref: https://cert-manager.io/docs/usage/ingress/#supported-annotations
  ## e.g:
  ## annotations:
  ##   kubernetes.io/ingress.class: nginx
  ##   cert-manager.io/cluster-issuer: cluster-issuer-name
  ##
  annotations:
    kubernetes.io/ingress.class: "gce-internal"
    ingress.gcp.kubernetes.io/pre-shared-cert: "blah-ca"
    kubernetes.io/ingress.regional-static-ip-name: "blahblah"
    kubernetes.io/ingress.allow-http: "false"
    ingress.kubernetes.io/force-ssl-redirect: "true"
  ## @param ingress.tls Enable TLS configuration for the host defined at `ingress.hostname` parameter
  ## TLS certificates will be retrieved from a TLS secret with name: `{{- printf "%s-tls" .Values.ingress.hostname }}`
  ## You can:
  ##   - Use the `ingress.secrets` parameter to create this TLS secret
  ##   - Rely on cert-manager to create it by setting the corresponding annotations
  ##   - Rely on Helm to create self-signed certificates by setting `ingress.selfSigned=true`
  ##
  tls: false
  ## @param ingress.selfSigned Create a TLS secret for this ingress record using self-signed certificates generated by Helm
  ##
  selfSigned: false
  ## @param ingress.ingressClassName IngressClass that will be used to implement the Ingress (Kubernetes 1.18+)
  ## This is supported in Kubernetes 1.18+ and required if you have more than one IngressClass marked as the default for your cluster.
  ## ref: https://kubernetes.io/blog/2020/04/02/improvements-to-the-ingress-api-in-kubernetes-1.18/
  ##
  ingressClassName: ""
  ## @param ingress.extraHosts An array with additional hostname(s) to be covered with the ingress record
  ## e.g:
  ## extraHosts:
  ##   - name: elasticsearch.local
  ##     path: /
  ##
  extraHosts: []
  ## @param ingress.extraPaths An array with additional arbitrary paths that may need to be added to the ingress under the main host
  ## e.g:
  ## extraPaths:
  ## - path: /*
  ##   backend:
  ##     serviceName: ssl-redirect
  ##     servicePort: use-annotation
  ##
  extraPaths: []
  ## @param ingress.extraTls TLS configuration for additional hostname(s) to be covered with this ingress record
  ## ref: https://kubernetes.io/docs/concepts/services-networking/ingress/#tls
  ## e.g:
  ## extraTls:
  ## - hosts:
  ##     - elasticsearch.local
  ##   secretName: elasticsearch.local-tls
  ##
  extraTls: []
  ## @param ingress.secrets Custom TLS certificates as secrets
  ## NOTE: 'key' and 'certificate' are expected in PEM format
  ## NOTE: 'name' should line up with a 'secretName' set further up
  ## If it is not set and you're using cert-manager, this is unneeded, as it will create a secret for you with valid certificates
  ## If it is not set and you're NOT using cert-manager either, self-signed certificates will be created valid for 365 days
  ## It is also possible to create and manage the certificates outside of this helm chart
  ## Please see README.md for more information
  ## e.g:
  ## secrets:
  ##   - name: elasticsearch.local-tls
  ##     key: |-
  ##       -----BEGIN RSA PRIVATE KEY-----
  ##       ...
  ##       -----END RSA PRIVATE KEY-----
  ##     certificate: |-
  ##       -----BEGIN CERTIFICATE-----
  ##       ...
  ##       -----END CERTIFICATE-----
  ##
  secrets: []
  ## @param ingress.extraRules Additional rules to be covered with this ingress record
  ## ref: https://kubernetes.io/docs/concepts/services-networking/ingress/#ingress-rules
  ## e.g:
  ## extraRules:
  ## - host: example.local
  ##     http:
  ##       path: /
  ##       backend:
  ##         service:
  ##           name: example-svc
  ##           port:
  ##             name: http
  ##
  extraRules: [] 

Are you using any custom parameters or values?

- Auth enabled with generated TLS
- Ingress enabled with GKE annotations
- Readiness and liveness probes enabled

What is the expected behavior?

Backend healthy

What do you see instead?

With basic auth enabled, the GKE backend for the load balancer is not healthy. I tried changing the path to /login, but that did not work. The GKE backend for the ingress/LB requires a 200 response but never gets one, even though I can access ES and log in via the endpoints. I suspect it does not return a 200 because of the browser's basic-auth login dialog. I worked around this with Kibana by setting the backend health-check URL to /login.

Additional information

No response

@mmosierteamvelocity mmosierteamvelocity added the tech-issues The user has a technical issue about an application label May 11, 2023
@github-actions github-actions bot added the triage Triage is needed label May 11, 2023
@javsalgar javsalgar changed the title ElasticSearch GKE Ingress backend unhealthy. [bitnami/elasticsearch] GKE Ingress backend unhealthy. May 12, 2023
@github-actions github-actions bot added in-progress and removed triage Triage is needed labels May 12, 2023
@bitnami-bot bitnami-bot assigned corico44 and unassigned javsalgar May 12, 2023
@mmosierteamvelocity (Author)

Quick update: I verified that the landing page ES serves at the root path (/) does return a 401. Just FYI.

mmosierteamvelocity (Author) commented May 12, 2023

@mmosierteamvelocity (Author)

Any ideas?

@mmosierteamvelocity (Author)

Ah, perhaps I can add an anonymous user and then set the health-check URL to /_cluster/health?

https://www.elastic.co/guide/en/elasticsearch/reference/current/anonymous-access.html#anonymous-access

How could I add this user to the values yaml?
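Something along these lines might work through the chart's extraConfig value, which is rendered into elasticsearch.yml (untested sketch; anonymous_health is a placeholder name, and that role does not exist by default: it would have to be created separately, e.g. via the security API, with the monitor cluster privilege so /_cluster/health is readable):

```yaml
## Untested sketch: extra settings rendered into elasticsearch.yml.
extraConfig:
  ## Principal assigned to unauthenticated requests (placeholder name).
  xpack.security.authc.anonymous.username: anonymous_health
  ## Role(s) granted to anonymous requests; the role must grant the
  ## "monitor" cluster privilege for /_cluster/health to return 200.
  xpack.security.authc.anonymous.roles: anonymous_health
  ## Return 401 instead of 403 when anonymous lacks a privilege, so
  ## regular clients still get a WWW-Authenticate challenge.
  xpack.security.authc.anonymous.authz_exception: false
```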

@mmosierteamvelocity (Author)

I was able to add the anonymous user and allow it to see /_cluster/health without auth, and that now works as the health-check URL. Unfortunately, I am now seeing: upstream connect error or disconnect/reset before headers. retried and the latest reset reason: connection termination
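For what it's worth, a "connection termination" reset from a Google load balancer often indicates a protocol mismatch: with REST-layer TLS enabled, Elasticsearch serves HTTPS on 9200 while the backend service defaults to plain HTTP. A hedged sketch of declaring HTTPS at the service level via GKE's cloud.google.com/app-protocols annotation (the port name tcp-rest-api is an assumption and must match the chart's actual service port name):

```yaml
## Sketch: tell the GCLB to speak HTTPS to the Elasticsearch pods.
## The JSON key must be the *name* of the service port exposing the REST
## API; verify it with `kubectl get svc <release-name> -o yaml`.
service:
  annotations:
    cloud.google.com/app-protocols: '{"tcp-rest-api": "HTTPS"}'
```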

Any ideas?

@mmosierteamvelocity (Author)

@corico44? Any suggestions?

@corico44 (Contributor)

Hello @mmosierteamvelocity,

Sorry for the late response. I have opened an internal task to handle this problem. Thank you very much for reporting it! We will post any updates on the task in this ticket.

@corico44 corico44 added the on-hold Issues or Pull Requests with this label will never be considered stale label May 18, 2023
Kashemir001 commented Oct 9, 2023

If you had security.tls.restEncryption set to false, the source of the failing health checks could be bitnami/containers#47319. Upgrading the chart from 19.10.6 to 19.12.0 solved it for me. Note that the 19.13.0 release changed the health checks so they no longer use healthcheck.sh at all (#19677).

@github-actions github-actions bot added triage Triage is needed and removed on-hold Issues or Pull Requests with this label will never be considered stale labels Oct 9, 2023
@github-actions github-actions bot added on-hold Issues or Pull Requests with this label will never be considered stale and removed triage Triage is needed labels Oct 10, 2023
juan131 (Contributor) commented Dec 20, 2023

Hi everyone,

I reproduced the issue by installing the Elasticsearch chart on a GKE cluster with the values below:

security:
  enabled: true
  elasticPassword: some-password
  tls:
    restEncryption: true
    autoGenerated: true
    usePemCerts: true
ingress:
  enabled: true
  pathType: ImplementationSpecific
  hostname: blah.blah.blah
service:
  type: NodePort
  annotations:
    cloud.google.com/backend-config: '{"default": "elasticsearch"}'
extraDeploy:
  - apiVersion: cloud.google.com/v1
    kind: BackendConfig
    metadata:
      name: elasticsearch
    spec:
      healthCheck:
        type: HTTPS
        requestPath: /_cluster/health?local=true
        port: 9200

As you can see, I added a Google BackendConfig to adapt the health check used by GCP. However, the backend service is still listed as "UNHEALTHY" on GCP. The reason? There are actually two issues:

  • Authentication fails: the health-check probe sends no credentials
  • The TLS certs are auto-generated (self-signed)

Successful probes on port 9200 can be achieved using a "curl" command such as the one below:

$ kubectl port-forward svc/elasticsearch 9200:9200 &
$ curl -i -k --user elastic:some-password https://127.0.0.1:9200/_cluster/health?local=true
HTTP/1.1 200 OK
X-elastic-product: Elasticsearch
content-type: application/json
content-length: 383

{"cluster_name":"elastic","status":"green","timed_out":false,"number_of_nodes":3,"number_of_data_nodes":3,"active_primary_shards":0,"active_shards":0,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":0,"delayed_unassigned_shards":0,"number_of_pending_tasks":0,"number_of_in_flight_fetch":0,"task_max_waiting_in_queue_millis":0,"active_shards_percent_as_number":100.0}

As you can see, the -k and --user elastic:some-password flags solve the two issues mentioned above. However, it's not possible to reproduce this with the available options for healthCheck configuration on BackendConfig.

With this in mind, I don't know if it would be possible to make this work with GKE Ingress, but I'd recommend asking the Google support team about possible alternatives.
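One alternative that stays within BackendConfig's capabilities, sketched below and untested, is to disable REST-layer TLS and allow anonymous access to the health endpoint, so the probe needs neither -k nor credentials. Trade-offs: in-cluster REST traffic is unencrypted, and anonymous_health is a placeholder user/role whose role must be created with the monitor cluster privilege:

```yaml
## Untested sketch: make /_cluster/health probeable over plain HTTP
## without credentials, at the cost of unencrypted in-cluster REST traffic.
security:
  enabled: true
  tls:
    restEncryption: false   # REST API over plain HTTP; transport-layer TLS unaffected
extraConfig:
  xpack.security.authc.anonymous.username: anonymous_health   # placeholder
  xpack.security.authc.anonymous.roles: anonymous_health      # role must grant "monitor"
extraDeploy:
  - apiVersion: cloud.google.com/v1
    kind: BackendConfig
    metadata:
      name: elasticsearch
    spec:
      healthCheck:
        type: HTTP          # plain HTTP now that restEncryption is off
        requestPath: /_cluster/health?local=true
        port: 9200
```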

@juan131 juan131 self-assigned this Dec 21, 2023