Classification ppg #79
… into classification_ppg
… into classification_ppg
@@ -19,6 +23,8 @@ def extract_signal_quality_features(df: pd.DataFrame, config: SignalQualityFeatu

# Compute statistics of the spectral domain signals
df_windowed = extract_spectral_domain_features(config, df_windowed)

df_windowed.drop(columns = ['green'], inplace=True) # Drop the values channel since it is no longer needed
I understand what you are going for here, and it does save some code. But inplace=True is generally discouraged.
Is it just a matter of removing the parameter? And I agree with your last comment that this may be unnecessary altogether, since we will probably want to change the return value!
You still need to assign the result of the transformation to an object, e.g.: df_windowed = df_windowed.drop(columns=['green'])
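To illustrate the suggestion, here is a minimal sketch of dropping the column by reassignment instead of `inplace=True`. The toy DataFrame is a made-up stand-in for the real `df_windowed`, which is not shown in this thread.

```python
import pandas as pd

# Hypothetical stand-in for the windowed feature frame in the PR.
df_windowed = pd.DataFrame({"green": [0.1, 0.2], "var": [1.0, 2.0]})

# drop() returns a new DataFrame, so rebinding the name gives the same
# end result as the inplace=True call without mutating the original object.
df_windowed = df_windowed.drop(columns=["green"])

remaining_columns = list(df_windowed.columns)
```

Simply removing the parameter without reassigning would leave `df_windowed` unchanged, because the returned frame would be discarded.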
sigma = clf['sigma']

# Prepare the data
lr_clf.feature_names_in_ = ['var', 'mean', 'median', 'kurtosis', 'skewness', 'f_dom', 'rel_power', 'spectral_entropy', 'signal_to_noise', 'auto_corr']
You could check whether you can include these feature names when writing the classifier to pickle. That way you do not define them twice, and nothing can go wrong when you change one of them.
X_normalized = X.copy()
for idx, feature in enumerate(lr_clf.feature_names_in_):
    X_normalized[feature] = (X[feature] - mu[idx]) / sigma[idx]
You could take a look at: https://scikit-learn.org/dev/modules/generated/sklearn.preprocessing.StandardScaler.html
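As a rough sketch of what that would look like: `StandardScaler` learns the per-feature mean and scale during `fit` and applies `(x - mean) / scale` in `transform`, which matches the manual loop. The toy data below is made up. Note that `StandardScaler` uses the population standard deviation (`ddof=0`), so it only reproduces the existing `mu` and `sigma` if those were computed the same way.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Made-up feature frame for illustration.
X = pd.DataFrame({"var": [1.0, 2.0, 3.0, 4.0], "mean": [10.0, 20.0, 30.0, 40.0]})

# Fit learns the per-column mean_ and scale_; transform applies them.
scaler = StandardScaler().fit(X)
X_scaled = scaler.transform(X)

# Manual equivalent of the loop in the diff, using the fitted parameters.
mu = scaler.mean_
sigma = scaler.scale_
X_manual = ((X - mu) / sigma).to_numpy()
```

A fitted scaler can be pickled alongside the classifier, so the normalization parameters and the model would travel together.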

# Make predictions for PPG signal quality assessment
df[DataColumns.PRED_SQA_PROBA] = lr_clf.predict_proba(X_normalized)[:, 0]
df.drop(columns = lr_clf.feature_names_in_, inplace=True) # Drop the features used for classification since they are no longer needed
The same applies here as in my earlier comment about inplace=True.
df[DataColumns.PRED_SQA_PROBA] = lr_clf.predict_proba(X_normalized)[:, 0]
df.drop(columns = lr_clf.feature_names_in_, inplace=True) # Drop the features used for classification since they are no longer needed

return df
I even think that, ultimately, we do not need to return a DataFrame at all, but only a numpy.array or pandas.Series of the predicted probability. In principle this DataFrame is the same as the input with one extra column, right? I have not implemented this myself yet, so food for thought.
@Erikpostt I think so too, but over the coming period we should think critically about what is most practical.
Looks good! Merging...
Here is the pull request for the classification step. I will address the comments from feature extraction (minor things) in the next PR, for quantification (heart rate), because I am already actively working in that branch.