Added Outlier detection #869

Closed
wants to merge 10 commits into from

Conversation

pavitraag
Contributor

Techniques for Outlier detection

percentile.ipynb
zscore.ipynb
IQR.ipynb

Thank you for submitting your pull request! 🙌 We'll review it as soon as possible. If there are any specific instructions or feedback regarding your PR, we'll provide them here. Thanks again for your contribution! 😊

@pavitraag
Contributor Author

Closes #853

Outliers in Data

An outlier is a data point that deviates markedly from the majority of the data, lying far above or far below the other observations. Outliers can result from measurement or execution errors, and they can significantly affect machine learning algorithms. The process of analyzing them is known as outlier analysis or outlier mining.

Importance of Outlier Detection in Machine Learning

  1. Biased Models: Outliers can skew a model towards these extreme values, leading to poor generalization.
  2. Reduced Accuracy: Outliers introduce noise, complicating the detection of true data patterns and reducing model accuracy.
  3. Increased Variance: Outliers increase model sensitivity to data changes, reducing stability and reliability.
  4. Reduced Interpretability: Outliers obscure model insights, making predictions less trustworthy and impeding performance improvements.

Techniques for Outlier Detection

  1. Percentile Method: Identifies outliers as data points that fall outside a chosen percentile range (for example, below the 1st or above the 99th percentile).
  2. Z-Score Method: Detects outliers by calculating how many standard deviations a data point is from the mean.
  3. IQR Method: Uses the interquartile range (IQR) to identify outliers as data points lying more than 1.5 times the IQR above the third quartile or below the first quartile.
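
The three techniques listed above can be sketched roughly as follows. This is a minimal illustration using NumPy, not the code from the notebooks in this PR. The function names, the 1st/99th percentile cutoffs, the z-score threshold of 3, and the sample data are all assumptions chosen for the example.

```python
import numpy as np


def percentile_outliers(values, lower_pct=1.0, upper_pct=99.0):
    """Flag points outside the [lower_pct, upper_pct] percentile range.

    The 1st/99th percentile defaults are an assumed, commonly used choice.
    """
    x = np.asarray(values, dtype=float)
    low, high = np.percentile(x, [lower_pct, upper_pct])
    return (x < low) | (x > high)


def zscore_outliers(values, threshold=3.0):
    """Flag points more than `threshold` standard deviations from the mean.

    A threshold of 3 is an assumed convention, not taken from the PR.
    """
    x = np.asarray(values, dtype=float)
    std = x.std()  # population standard deviation (ddof=0)
    if std == 0:
        # All values are identical, so nothing can be an outlier.
        return np.zeros(x.shape, dtype=bool)
    z = (x - x.mean()) / std
    return np.abs(z) > threshold


def iqr_outliers(values, k=1.5):
    """Flag points more than k * IQR below Q1 or above Q3."""
    x = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return (x < q1 - k * iqr) | (x > q3 + k * iqr)


# Hypothetical sample: ten ordinary readings and one extreme value.
data = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 95]

for name, detect in [
    ("percentile", percentile_outliers),
    ("z-score", zscore_outliers),
    ("IQR", iqr_outliers),
]:
    mask = detect(data)
    print(name, np.asarray(data)[mask])
```

Each function returns a boolean mask rather than dropping rows, so the caller can decide whether to remove, cap, or simply inspect the flagged points. Note that on small samples the z-score of a single extreme value is bounded, so a threshold of 3 may flag nothing when there are only a handful of observations.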

@pavitraag pavitraag closed this Jun 27, 2024
@pavitraag pavitraag reopened this Jun 27, 2024
@invigorzz313
Contributor

@pavitraag For every PR, create a new branch in your forked repo and commit to it instead of committing directly to master.
Also, there are 2 project files in this PR.

@pavitraag pavitraag closed this Jun 28, 2024