Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dm_learn_from_db and MySQL #366

Closed
ericemc3 opened this issue May 17, 2020 · 15 comments · Fixed by #1123
Closed

dm_learn_from_db and MySQL #366

ericemc3 opened this issue May 17, 2020 · 15 comments · Fixed by #1123
Assignees
Labels
enhancement New feature or request
Milestone

Comments

@ericemc3
Copy link

looking forward to playing with this within MySQL...

@krlmlr
Copy link
Collaborator

krlmlr commented May 17, 2020

I wonder if postgres_learn_query() works well enough for MariaDB? I never tried, would review a pull request.

@krlmlr krlmlr added the enhancement New feature or request label May 17, 2020
@ericemc3
Copy link
Author

It won't probably work as if, because you won't find in MySQL the same exact column names.
We are (in my company) implementing the same db model with various databases (such as MySQL, Postgres or MsSQL).

Useful requests for MySQL would be something like:

/*foreign keys*/    
SELECT column_name, constraint_name, referenced_table_name, referenced_column_name, table_name, ordinal_position
		FROM information_schema.key_column_usage
		WHERE referenced_table_name IS NOT NULL
		and table_schema = 'myschema'
		order by table_name,ordinal_position
			
/*primary keys*/  			
SELECT column_name, constraint_name, referenced_table_name, referenced_column_name, table_name, ordinal_position
		FROM information_schema.key_column_usage 
		WHERE table_schema = 'myschema'
		AND constraint_name = 'primary'
		
/*column info*/  			
SELECT * FROM information_schema.columns WHERE table_schema = 'myschema'

@krlmlr krlmlr added this to the 0.2.2 milestone May 11, 2021
@krlmlr krlmlr modified the milestones: 0.2.2, 0.2.4 Jun 20, 2021
@krlmlr krlmlr modified the milestones: 0.2.4, 0.2.6 Oct 9, 2021
@krlmlr krlmlr self-assigned this Oct 18, 2021
@noamross
Copy link

noamross commented Nov 9, 2021

Would you take a PR for this or is it dependent on your expected changes in #342 and #229?

@krlmlr
Copy link
Collaborator

krlmlr commented Nov 15, 2021

Thanks. Let me take another stab getting the internal structure right, happy to take PRs after that.

@krlmlr krlmlr modified the milestones: 0.2.6, 0.3.2 Nov 16, 2021
@krlmlr krlmlr modified the milestones: 0.3.2, 0.2.8, 0.2.9 Apr 7, 2022
@krlmlr
Copy link
Collaborator

krlmlr commented May 30, 2022

@noamross: Now that we have an idea what dm_meta() should look like, we could take another stab.

I've found that, across databases, constraint_column_usage is the most inconsistent. It seems that REFERENTIAL_CONSTRAINTS is the thing to use in MySQL/MariaDB?

Another thing that gave me headaches for Postgres are compound keys -- constraint_column_usage seems to lack a crucial column that helps infer compound keys correctly, I worked around with a custom query against the native views.

@krlmlr krlmlr modified the milestones: 0.2.9, 0.3.0 Jun 6, 2022
@krlmlr
Copy link
Collaborator

krlmlr commented Jun 17, 2022

We're very close to an implementation. Do you have a chance to test it in the next few days?

@noamross
Copy link

So currently learning constraints is failing for me on a Dolt Database. Dolt has the same information schema as MariaDB/MySQL. But while DoltConnection inherits from MariaDBConnection, src_DoltConnection does not inherit from src_MariaDBConnection. This is because dbplyr::src_dbi creates src_ classes with subclass <- paste0("src_", class(con)[[1]]), ignoring other inheritance. When I debug this and modify the class so is_mariadb() at https://github.com/cynkra/dm/pull/1123/files#diff-44b9c1498a56325c16fa5943d640390ae701cf33675e3c88c0b079708b763a5aR70 results in TRUE, foreign and primary keys are learned correctly.

@noamross
Copy link

I'm not sure the best way to fix this. Do you think a PR to dbplyr is best, even though it will take a while to work through the system?

@krlmlr
Copy link
Collaborator

krlmlr commented Jun 20, 2022

Thanks. Would it work if we tweaked is_mariadb() here?

@noamross
Copy link

A hack including src_DoltConnection, and src_DoltLocalConnection would work.

@noamross
Copy link

Also, for Dolt (but also possibly MySQL/RMariaDB?), the data model returned includes all the tables and columns in information_schema, as well. Perhaps we want to filter out table_schema == "information_schema" by default?

@krlmlr
Copy link
Collaborator

krlmlr commented Jun 20, 2022

Thanks. Self-reflection about information_schema is a feature, I agree we might want to leave it out if schema = NULL .

Happy to review PRs.

@krlmlr
Copy link
Collaborator

krlmlr commented Jun 21, 2022

Dolt hack added in 337de7a.

Should we add Dolt to our CI/CD (and perhaps also to RMariaDB's)?

@noamross
Copy link

That would be great. Note that you're as likely to find bugs in Dolt as it's still pretty beta. (But the Dolt team responds fast, e.g. ecohealthalliance/doltr#46 (comment)).

@github-actions
Copy link
Contributor

This old thread has been automatically locked. If you think you have found something related to this, please open a new issue and link to this old issue if necessary.

@github-actions github-actions bot locked and limited conversation to collaborators Jun 22, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement New feature or request
Development

Successfully merging a pull request may close this issue.

3 participants