Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase efficiency gait #80

Merged
merged 27 commits into from
Dec 3, 2024
Merged

Increase efficiency gait #80

merged 27 commits into from
Dec 3, 2024

Conversation

Erikpostt
Copy link
Contributor

Het doel van deze PR is om de pipeline een stuk efficiënter te maken, met name bij preprocessing en feature extraction. Ik heb de volgende aanpassingen gemaakt:

  • Quantification weggehaald voor nu (deze komt in een volgende PR terug)
  • Feature names gestandaardizeerd met _sensor_name__X
  • Cepstral coefficients (CCs) aangepast naar Mel Frequency Cepstral coefficients (MFFCs) na overleg met Nienke
  • IMU preprocessing werkt nu met numpy, waardoor o.a. e.e.a geparalleliseerd wordt
  • Tabulate windows werkt nu met numpy, waardoor de tremor and HR pipeline nu tijdelijk gebruik maken van tabulate_windows_legacy wat nog gebruik maakt van pandas.
  • Classifiers, scalers en thresholds zijn geupdate met aanpassingen o.b.v pdathome

@Erikpostt Erikpostt requested a review from Copilot December 3, 2024 09:52

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 20 out of 33 changed files in this pull request and generated no suggestions.

Files not reviewed (13)
  • tests/data/0.classification/gait/scalers/gait_detection_scaler_params.json: Language not supported
  • tests/data/0.classification/gait/scalers/gait_filtering_scaler_params.json: Language not supported
  • tests/data/0.classification/gait/thresholds/gait_detection_threshold.txt: Language not supported
  • tests/data/0.classification/gait/thresholds/gait_filtering_threshold.txt: Language not supported
  • tests/data/2.preprocessed_data/imu/accelerometer_meta.json: Language not supported
  • tests/data/3.extracted_features/gait/arm_activity_meta.json: Language not supported
  • tests/data/3.extracted_features/gait/gait_meta.json: Language not supported
  • tests/data/5.quantification/gait/arm_swing_meta.json: Language not supported
  • docs/notebooks/gait/gait_analysis.ipynb: Evaluated as low risk
  • src/paradigma/heart_rate/heart_rate_analysis.py: Evaluated as low risk
  • src/paradigma/tremor/tremor_analysis.py: Evaluated as low risk
  • src/paradigma/constants.py: Evaluated as low risk
  • src/paradigma/preprocessing_config.py: Evaluated as low risk
Comments skipped due to low confidence (7)

src/paradigma/imu_preprocessing.py:6

  • The term interp1d should be capitalized as interp1D to match the convention used by the scipy.interpolate module.
from scipy.interpolate import interp1d

src/paradigma/imu_preprocessing.py:43

  • The condition config.acceleration_units == 'm/s^2' should be compared using DataUnits.ACCELERATION for consistency.
if config.acceleration_units == 'm/s^2':

src/paradigma/imu_preprocessing.py:48

  • [nitpick] The variable name filter_configs is ambiguous and does not clearly convey its purpose.
filter_configs = {

src/paradigma/imu_preprocessing.py:199

  • The check if not np.all(np.diff(time_abs_array) > 0) should be performed before creating t_resampled to avoid unnecessary computation.
if not np.all(np.diff(time_abs_array) > 0):

src/paradigma/imu_preprocessing.py:217

  • The parameter single_sensor_col in the docstring should be updated to data.
single_sensor_col: np.ndarray,

src/paradigma/imu_preprocessing.py:243

  • The return type in the docstring should be updated to np.ndarray.
sensor_column_filtered: pd.Series

tests/test_gait_analysis.py:116

  • The test case for arm swing quantification has been commented out. This should be addressed to ensure that the functionality is properly tested.
# def test_6_arm_swing_quantification_output(shared_datadir: Path):
@Erikpostt Erikpostt requested a review from Copilot December 3, 2024 10:20
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 21 out of 34 changed files in this pull request and generated 1 suggestion.

Files not reviewed (13)
  • tests/data/0.classification/gait/scalers/gait_detection_scaler_params.json: Language not supported
  • tests/data/0.classification/gait/scalers/gait_filtering_scaler_params.json: Language not supported
  • tests/data/0.classification/gait/thresholds/gait_detection_threshold.txt: Language not supported
  • tests/data/0.classification/gait/thresholds/gait_filtering_threshold.txt: Language not supported
  • tests/data/2.preprocessed_data/imu/accelerometer_meta.json: Language not supported
  • tests/data/3.extracted_features/gait/arm_activity_meta.json: Language not supported
  • tests/data/3.extracted_features/gait/gait_meta.json: Language not supported
  • tests/data/5.quantification/gait/arm_swing_meta.json: Language not supported
  • docs/notebooks/gait/gait_analysis.ipynb: Evaluated as low risk
  • src/paradigma/heart_rate/heart_rate_analysis.py: Evaluated as low risk
  • src/paradigma/tremor/tremor_analysis.py: Evaluated as low risk
  • src/paradigma/constants.py: Evaluated as low risk
  • src/paradigma/ppg_preprocessing.py: Evaluated as low risk
Comments skipped due to low confidence (6)

src/paradigma/imu_preprocessing.py:214

  • The parameter 'single_sensor_col' was renamed to 'data', but the docstring was not updated to reflect this change. Update the docstring to match the new parameter name.
single_sensor_col: np.ndarray,

src/paradigma/segmenting.py:10

  • [nitpick] The parameter 'columns' in 'tabulate_windows' should be clearly documented to explain its purpose.
def tabulate_windows(config, df, columns):

src/paradigma/segmenting.py:109

  • The function 'create_segments' documentation should be updated to reflect the use of 'config.time_colname' instead of 'time_column_name'.
def create_segments(config, df):

src/paradigma/gait/gait_analysis_config.py:16

  • [nitpick] The variable name 'l_axes' is ambiguous. It should be renamed to 'axes_list' for clarity.
self.l_axes = ["x", "y", "z"]

src/paradigma/gait/gait_analysis_config.py:78

  • [nitpick] The variable name 'window_step_length_s' should be consistent with 'window_length_s'. Consider renaming it to 'window_step_size_s'.
self.window_step_length_s: int = 1

src/paradigma/gait/gait_analysis_config.py:102

  • The variable 'self.sensor' is used before being initialized. Ensure 'self.sensor' is correctly set before this line.
f"{self.sensor}_std_norm": DataUnits.GRAVITY,

time_start='min', # Start time (min time in each segment)
time_end='max' # End time (max time in each segment)
).reset_index()

return df_segment_times


def discard_segments(df, segment_nr_colname, min_length_segment_s, sampling_frequency):
def discard_segments(config, df):
Copy link
Preview

Copilot AI Dec 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function 'discard_segments' documentation should be updated to reflect the use of 'config.min_segment_length_s' and 'config.sampling_frequency' instead of 'min_length_segment_s' and 'sampling_frequency'.

Copilot is powered by AI, so mistakes are possible. Review output carefully before use.

Positive Feedback
Negative Feedback

Provide additional feedback

Please help us improve GitHub Copilot by sharing more details about this comment.

Please select one or more of the options
@Erikpostt
Copy link
Contributor Author

@KarsVeldkamp ready for review. Ik heb nog een aantal aanpassingen gemaakt in de docstrings en variabelnamen voor consistentie, die had ik eigenlijk in een volgende PR moeten doen maar mijn perfectionisme zat me dwars. Excuses voor deze toevoeging!

list
The aggregated statistics
"""
def compute_statistics(data: np.ndarray, statistic: str) -> np.ndarray:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ziet er goed uit en lijkt idd een stuk efficiënter zo, maar wil je hier ook nog doc strings toevoegen?

sampling_frequency: int = 100,
) -> tuple:
"""Compute the Fast Fourier Transform (FFT) of a signal per window (can probably be combined with compute_fft and simplified).
def compute_power_in_bandwidth(config, psd, freqs):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idem en wat bedoel je inde functie met band_mask (miss kun je nog wat comments plaatsen want vond deze niet self-explaining)

if df.loc[index, angle_colname][df.loc[index, f'{angle_colname}_new_minima'][i_pks+1]] < df.loc[index, angle_colname][df.loc[index, f'{angle_colname}_new_minima'][i_pks]]:
df.at[index, f'{angle_colname}_new_minima'] = np.delete(df.loc[index, f'{angle_colname}_new_minima'], i_pks)
# otherwise, keep the current minimum and discard the next minimum
distances = config.sampling_frequency * 0.6 / dominant_frequencies
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Zijn deze 2 values nog iets voor in de config?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deze line plus de volgende van prominence

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prominence mogelijk wel, maar de dominant_frequencies is een array met een dominant frequency per window. Ik denk dat ik 'm voor nu binnen deze functie houd, omdat ik prominence niet ergens anders gebruik nog.

scaler.scale_ = scaler_params['scale']
scaler.feature_names_in_ = scaler_params['features']

df[scaler_params['features']] = scaler.transform(df[scaler_params['features']])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dit bedoelde je dus ook bij mijn stukje, zal dit aanpassen ;)

# window_step_size_s=config.window_step_length_s,
# metrics=['range_of_motion', f'peak_{config.velocity_colname}'],
# aggregates=['median'],
# quantiles=[0.95]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wil je de quantiles nog naar je config zetten? Weet dat hij nu uitgecomment is dus miss was je dat straks ook al van plan

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Goeie, had ik nog niet over nagedacht maar ga ik zeker doen!

0.18271208243897705
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eea opnieuw getraind?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Zeker, een aantal spectral features zijn aangepast (o.a. MFCCs die nu mel-scale gebruiken)

@KarsVeldkamp KarsVeldkamp marked this pull request as ready for review December 3, 2024 13:07
Copy link
Contributor

@KarsVeldkamp KarsVeldkamp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Erik,

Ziet er goed uit, paar kleine dingetjes in de comments maar het lijkt idd een stuk efficiënter te zijn! Goed werk!

@Erikpostt Erikpostt merged commit 5ed7fd4 into main Dec 3, 2024
1 check passed
@Erikpostt Erikpostt deleted the increase_efficiency_gait branch December 4, 2024 15:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants