Outliers detection/removal component #79

Open
cvd-q opened this issue Sep 11, 2021 · 2 comments

Comments

@cvd-q

cvd-q commented Sep 11, 2021

Dear TFX community members,

I'm a Data Science student at the University of Padova (Italy), and I've decided to write my master's thesis on MLOps. I'm very interested in TFX and would like to analyze it in depth. For the experimental part, my supervisor and I have in mind a Beam/Spark (MapReduce) implementation of an outlier detection algorithm, aimed especially at large datasets, where we think such a preprocessing step may be helpful. We would then like to contribute to this project by creating a custom component.

Could this idea be useful in some way? Are you planning to release data mining components?

Thanks a lot for your advice.


@rcrowe-google
Collaborator

> Could this idea be useful in some way? Are you planning to release data mining components?

Thanks Jiawei, and welcome to the group! Outlier detection is very useful in many domains. We have not really focused on data mining, but there is a lot of overlap between techniques for data mining and techniques for machine learning, especially in preprocessing. If you'd like to submit a proposal, please see the instructions in the proposals folder.

@cvd-q
Author

cvd-q commented Sep 14, 2021

Thank you for replying and confirming that the idea is of interest for preprocessing, as I had thought. I'm now going to discuss with my professor which algorithms are suitable for implementation in Beam, and I will write a proposal as soon as a final decision is made.

@cvd-q cvd-q closed this as completed Nov 13, 2021
@cvd-q cvd-q reopened this Nov 13, 2021