[backend] metadata-grpc-service mysql_real_connect failed (SSL enabled) #6711

Open
andrijaperovic opened this issue Oct 10, 2021 · 27 comments

@andrijaperovic

Environment

  • How did you deploy Kubeflow Pipelines (KFP)?
    Kubeflow Pipelines Standalone

  • KFP version:
    1.7.0-alpha.1

  • KFP SDK version:
    1.6.2

Steps to reproduce

  1. Enable SSL on external MySQL configuration (Azure MySQL Server)
  2. Restart metadata-grpc-deployment pod
  3. Observe error:
    2021-10-08 19:32:42.461171: F ml_metadata/metadata_store/metadata_store_server_main.cc:226] Non-OK-status: status status: Internal: mysql_real_connect failed: errno: 0, error: MetadataStore cannot be created with the given connection config.
    

Expected result

metadata-grpc-deployment should initialize successfully. (I am not sure whether an SSL parameter is expected in the deployment config.)

Materials and Reference

Running gcr.io/tfx-oss-public/ml_metadata_store_server:1.0.0.


Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.

@andrijaperovic
Author

google/ml-metadata#20

@andrijaperovic
Author

I tried passing the --metadata_store_server_config_file command-line flag to /bin/metadata_store_server, with the config in text protobuf format, but observed a similar error:

      - args:
        - --grpc_port=8080
        - --enable_database_upgrade=true
        - --metadata_store_server_config_file=/config
        command:
        - /bin/metadata_store_server

Config:

connection_config {
  mysql {
    host: "..."
    port: int
    database: "..."
    user: "..."
    password: "..."
    ssl_options {
      verify_server_cert: true
      capath: "/etc/ssl/certs/my-cert.pem"
    }
  }
}

Error:

2021-10-11 18:04:01.536992: F ml_metadata/metadata_store/metadata_store_server_main.cc:226] Non-OK-status: status status: Internal: mysql_real_connect failed: errno: 0, error: MetadataStore cannot be created with the given connection config.

Is there any example documentation regarding SSL setup for metadata-grpc as it pertains to kubeflow pipelines?

@zijianjoy
Collaborator

Hello @andrijaperovic, could you check the logs of the cloudsqlproxy deployment? They should have more information about why the connection failed.

@andrijaperovic
Author

@zijianjoy thank you for following up.

I thought cloudsqlproxy was only applicable to GCP, based on the kustomize template:

https://github.com/kubeflow/pipelines/blob/74c7773ca40decfd0d4ed40dc93a6af591bbc190/manifests/kustomize/env/gcp/cloudsql-proxy/cloudsql-proxy-deployment.yaml

We are running in an Azure (AKS) environment, so there is no cloudsqlproxy deployment in our KFP installation.

@zijianjoy
Collaborator

I am not very familiar with SQL connections in Azure; the article I could find is https://docs.microsoft.com/en-us/azure/azure-sql/database/connectivity-architecture.

Cc @berndverst for more help with Azure.

@andrijaperovic
Author

@zijianjoy since this issue is due to a sub-component shipped from google/ml-metadata, I've filed a corresponding ticket there:
google/ml-metadata#130

I have reached out to @BrianSong to see whether the google/ml-metadata ticket can be reopened.
After my discussion with @berndverst, it appears this issue can only be addressed in google/ml-metadata.

@stale

stale bot commented Mar 2, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the lifecycle/stale label Mar 2, 2022

@ReggieCarey

I'm experiencing this problem (or one that looks like it). The logs contain repeated warnings:

W0213 18:24:37.372198 3635 metadata_store_service_impl.cc:432] Failed to connect to the database: mysql_real_connect failed: errno: , error:
W0213 18:24:37.372260 3637 metadata_store_service_impl.cc:432] Failed to connect to the database: mysql_real_connect failed: errno: , error:
W0213 18:24:37.372275 3638 metadata_store_service_impl.cc:432] Failed to connect to the database: mysql_real_connect failed: errno: , error:
W0213 18:24:37.372458 3639 metadata_store_service_impl.cc:432] Failed to connect to the database: mysql_real_connect failed: errno: , error:
W0213 18:24:37.372793 3621 metadata_store_service_impl.cc:432] Failed to connect to the database: mysql_real_connect failed: errno: , error:
W0213 18:24:37.465122 3622 metadata_store_service_impl.cc:432] Failed to connect to the database: mysql_real_connect failed: errno: , error:
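
Because the same warning repeats across many threads, it can be easier to read the log as counts per source location. The following is a minimal Python sketch, assuming the glog-style line layout seen in the lines above (severity letter, MMDD date, time, thread ID, `file:line]`, message); the regular expression and the `summarize` helper are illustrative and not part of ML Metadata.

```python
import re
from collections import Counter

# glog-style prefix: severity, MMDD, HH:MM:SS.micros, thread id, file:line]
GLOG_LINE = re.compile(
    r"^(?P<sev>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<tid>\d+) (?P<src>[^\s\]]+:\d+)\] (?P<msg>.*)$"
)


def summarize(lines):
    """Count log lines by (severity, source location, message).

    Lines that do not match the assumed glog layout are ignored.
    """
    counts = Counter()
    for line in lines:
        match = GLOG_LINE.match(line.strip())
        if match:
            key = (match["sev"], match["src"], match["msg"].strip())
            counts[key] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: kubectl logs <pod> | python summarize_glog.py
    for (sev, src, msg), n in summarize(sys.stdin).most_common():
        print(f"{n:6d}  {sev} {src}  {msg}")
```

Note that in these warnings both `errno` and `error` are empty, so the summary groups identical lines together but cannot say anything about the underlying cause.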

@stale stale bot removed the lifecycle/stale label Feb 13, 2023

@ReggieCarey

This causes Kubeflow Artifacts to fail:
W0308 15:29:32.612677 247 metadata_store_service_impl.cc:432] Failed to connect to the database: mysql_real_connect failed: errno: , error:
W0308 15:29:32.613674 245 metadata_store_service_impl.cc:432] Failed to connect to the database: mysql_real_connect failed: errno: , error:
W0308 15:29:32.614279 233 metadata_store_service_impl.cc:432] Failed to connect to the database: mysql_real_connect failed: errno: , error:
W0308 15:29:32.614387 248 metadata_store_service_impl.cc:432] Failed to connect to the database: mysql_real_connect failed: errno: , error:
W0308 15:29:32.614480 236 metadata_store_service_impl.cc:432] Failed to connect to the database: mysql_real_connect failed: errno: , error:
W0308 15:29:32.616160 240 metadata_store_service_impl.cc:432] Failed to connect to the database: mysql_real_connect failed: errno: , error:
W0308 15:29:32.616427 244 metadata_store_service_impl.cc:432] Failed to connect to the database: mysql_real_connect failed: errno: , error:
W0308 15:29:32.616662 249 metadata_store_service_impl.cc:432] Failed to connect to the database: mysql_real_connect failed: errno: , error:
W0308 15:29:32.618607 201 metadata_store_service_impl.cc:432] Failed to connect to the database: mysql_real_connect failed: errno: , error:
W0308 15:29:32.618873 194 metadata_store_service_impl.cc:432] Failed to connect to the database: mysql_real_connect failed: errno: , error:
W0308 15:29:32.625411 211 metadata_store_service_impl.cc:432] Failed to connect to the database: mysql_real_connect failed: errno: , error:
W0308 15:29:32.625411 222 metadata_store_service_impl.cc:432] Failed to connect to the database: mysql_real_connect failed: errno: , error:
W0308 15:29:32.625516 208 metadata_store_service_impl.cc:432] Failed to connect to the database: mysql_real_connect failed: errno: , error:
W0308 15:29:32.626050 228 metadata_store_service_impl.cc:432] Failed to connect to the database: mysql_real_connect failed: errno: , error:
W0308 15:29:32.626132 183 metadata_store_service_impl.cc:432] Failed to connect to the database: mysql_real_connect failed: errno: , error:
W0308 15:29:32.626195 217 metadata_store_service_impl.cc:432] Failed to connect to the database: mysql_real_connect failed: errno: , error:
W0308 15:29:32.626196 215 metadata_store_service_impl.cc:432] Failed to connect to the database: mysql_real_connect failed: errno: , error:
W0308 15:29:32.626279 207 metadata_store_service_impl.cc:432] Failed to connect to the database: mysql_real_connect failed: errno: , error:
W0308 15:29:32.626308 220 metadata_store_service_impl.cc:432] Failed to connect to the database: mysql_real_connect failed: errno: , error:
W0308 15:29:32.626356 221 metadata_store_service_impl.cc:432] Failed to connect to the database: mysql_real_connect failed: errno: , error:
W0308 15:29:32.630316 246 metadata_store_service_impl.cc:432] Failed to connect to the database: mysql_real_connect failed: errno: , error:
W0308 15:29:32.630416 237 metadata_store_service_impl.cc:432] Failed to connect to the database: mysql_real_connect failed: errno: , error:
W0308 15:29:32.631172 232 metadata_store_service_impl.cc:432] Failed to connect to the database: mysql_real_connect failed: errno: , error:
W0308 15:29:32.631250 238 metadata_store_service_impl.cc:432] Failed to connect to the database: mysql_real_connect failed: errno: , error:
W0308 15:29:32.631335 241 metadata_store_service_impl.cc:432] Failed to connect to the database: mysql_real_connect failed: errno: , error:
W0308 15:29:32.631448 242 metadata_store_service_impl.cc:432] Failed to connect to the database: mysql_real_connect failed: errno: , error:

@mohamedFaris47

I have a similar problem with Kubeflow. The MySQL pod is not running, so the grpc and ml-pipeline pods can't connect to the database and are stuck in CrashLoopBackOff.
Did you find a solution?

@dashanji

> I have a similar problem with Kubeflow. The MySQL pod is not running, so the grpc and ml-pipeline pods can't connect to the database and are stuck in CrashLoopBackOff. Did you find a solution?

+1

@mohamedFaris47

For me, running the Kubeflow deployment command again fixed the issue. Sometimes the MySQL pod gets deleted (I don't know why), which caused this problem, so redeploying restored it.
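
To tell a missing or unreachable database apart from a TLS or credentials problem, a plain TCP check from inside the cluster can be a useful first step. This is a minimal Python sketch using only the standard library; `can_reach` is a hypothetical helper, and it only tests whether the port accepts connections, not whether MySQL authentication or SSL would succeed.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    This checks network reachability only; it does not speak the MySQL
    protocol, so it cannot detect TLS or authentication failures.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, `can_reach("mysql", 3306)` could be run from a pod in the same namespace, where the service name `mysql` and port `3306` are assumptions about the deployment. A `False` result points at a missing pod, service, or network path rather than at the connection config.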

@dashanji

Hi @mohamedFaris47, thanks for the kind reply. What do you mean by running the Kubeflow command again? Also, did you follow the Kubeflow installation doc to deploy the KFP standalone instance?

@mohamedFaris47

I mean running the deployment command mentioned in the Kubeflow documentation here

@dashanji

> I mean running the deployment command mentioned in the Kubeflow documentation here

I see, but the deployment still failed after I ran the command. The mysql pod reported the error mysqld: Can't read dir of '/etc/mysql/conf.d/' (OS errno 13 - Permission denied).

@mohamedFaris47

Unfortunately, I don't know the cause of this permission error. That's different from my case: only the mysql pod was missing, so redeploying fixed it.

@rimolive
Member

Is this issue still happening in KFP 2.0.5?

@revit13
Contributor

revit13 commented Apr 1, 2024

I encountered this error on Ubuntu 20.04.6 LTS when deploying standalone KFP 1.8.5 on a Kind cluster. Any help in resolving it would be appreciated.

[pod/metadata-grpc-deployment-b45564d7d-rqf46/container] WARNING: Logging before InitGoogleLogging() is written to STDERR
[pod/metadata-grpc-deployment-b45564d7d-rqf46/container] F0331 20:05:33.575326     1 metadata_store_server_main.cc:236] Check failed: absl::OkStatus() == status (OK vs. INTERNAL: mysql_real_connect failed: errno: , error:  [mysql-error-info='']) MetadataStore cannot be created with the given connection config.


This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the lifecycle/stale label Jun 15, 2024

@aaj-synth

not-stale

@stale stale bot removed the lifecycle/stale label Jun 28, 2024

@rimolive
Member

@aaj-synth Please let us know if this issue still occurs on KFP 2.2.0


This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the lifecycle/stale label Aug 28, 2024

This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.

@922tech

922tech commented Nov 6, 2024

/reopen


@922tech: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@HumairAK
Collaborator

HumairAK commented Nov 6, 2024

/reopen

@google-oss-prow google-oss-prow bot reopened this Nov 6, 2024

@HumairAK: Reopened this issue.

In response to this:

/reopen


@github-actions github-actions bot removed the lifecycle/stale label Nov 7, 2024