Adding feature extraction #77

Merged
merged 18 commits into from
Dec 2, 2024

Conversation

Contributor

@KarsVeldkamp KarsVeldkamp commented Nov 26, 2024

Hi Erik,

Here is the PR for feature extraction. At the end I adjusted a few things so that it is really only relevant to feature extraction (plus some preprocessing/restructuring).

Comment on lines +53 to +60
    l_signal_to_noise_ratios = []
    for segment in ppg_segments:
        arr_signal = np.var(segment)
        arr_noise = np.var(np.abs(segment))
        signal_to_noise_ratio = arr_signal / arr_noise
        l_signal_to_noise_ratios.append(signal_to_noise_ratio)

    return l_signal_to_noise_ratios
Contributor

@Erikpostt Erikpostt Nov 27, 2024

You can also use NumPy arrays here, so you don't have to loop. I'm not sure it works exactly like this, but roughly. Your return value then also becomes an np.ndarray instead of a list.

Suggested change
-    l_signal_to_noise_ratios = []
-    for segment in ppg_segments:
-        arr_signal = np.var(segment)
-        arr_noise = np.var(np.abs(segment))
-        signal_to_noise_ratio = arr_signal / arr_noise
-        l_signal_to_noise_ratios.append(signal_to_noise_ratio)
-    return l_signal_to_noise_ratios
+    arr_signal = np.var(ppg_segments, axis=1)
+    abs_signal = np.abs(ppg_segments, axis=1)
+    arr_noise = np.var(abs_signal, axis=1)
+    signal_to_noise_ratio = arr_signal / arr_noise
+    return signal_to_noise_ratio
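
A hedged sketch of how the vectorized version could look, assuming `ppg_segments` is a 2D array of shape (n_segments, n_samples). One detail differs from the suggestion as written: `np.abs` is an element-wise ufunc and does not accept an `axis` argument, so it is called without one here. The function name `signal_to_noise_ratios` and the random test data are illustrative only.

```python
import numpy as np


def signal_to_noise_ratios(ppg_segments: np.ndarray) -> np.ndarray:
    """Per-segment ratio var(x) / var(|x|), computed without a Python loop.

    Assumes ppg_segments has shape (n_segments, n_samples).
    """
    arr_signal = np.var(ppg_segments, axis=1)
    # np.abs works element-wise; only np.var needs the axis argument.
    arr_noise = np.var(np.abs(ppg_segments), axis=1)
    return arr_signal / arr_noise


# Compare against the original per-segment loop on synthetic data.
rng = np.random.default_rng(0)
segments = rng.standard_normal((4, 180))
vectorized = signal_to_noise_ratios(segments)
looped = np.array([np.var(s) / np.var(np.abs(s)) for s in segments])
```

The result is an `np.ndarray` with one value per segment rather than a list, which matches the note in the comment above.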

Compute the autocorrelation of the PPG signal.

Args:
ppg_signal (np.ndarray): 2D array where each row is a segment of the PPG signal.
Contributor

@Erikpostt Erikpostt Nov 27, 2024

The dtype is already clear from the code above. The same goes for parameters in other functions.

Suggested change
-    ppg_signal (np.ndarray): 2D array where each row is a segment of the PPG signal.
+    ppg_segments: 2D array where each row is a segment of the PPG signal.

autocorrelations = biased_autocorrelation(segment, fs*3)
peaks, _ = find_peaks(autocorrelations, height=0.01)
peak_values = autocorrelations[peaks]
sorted_peaks = np.sort(peak_values)[::-1]
Contributor

Is it correct that you sort them in descending order here?

Contributor Author

Yes, because I want the highest peak.

Contributor

@Erikpostt Erikpostt Nov 27, 2024

Then you could optionally also do the following (for example):

l_auto_correlations.append(np.max(peak_values, initial=0))  # extract the highest peak

But I'll leave that to you for later ;)
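
A small sketch of why the `np.max(..., initial=0)` form is attractive here: it returns the same highest peak as sorting in descending order and taking the first element, and it also handles a segment with no detected peaks. The example values are made up for illustration.

```python
import numpy as np

peak_values = np.array([0.12, 0.45, 0.30])
no_peaks = np.array([])

# Sorting in descending order and taking the first element ...
highest_sorted = np.sort(peak_values)[::-1][0]
# ... gives the same value as np.max, without sorting the whole array.
highest_max = np.max(peak_values, initial=0)

# With no peaks, `initial` supplies the fallback instead of raising ValueError.
fallback = np.max(no_peaks, initial=0)
```

Because `find_peaks` is called with `height=0.01` in the snippet under review, every detected peak is positive, so an `initial` of 0 only takes effect when no peak was found.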

Returns:
np.ndarray: Biased autocorrelation values for lags 0 to max_lag.
"""
x = np.array(x) # Ensure x is a numpy array instead of a list
Contributor

@Erikpostt Erikpostt Nov 27, 2024

I don't know exactly what you want, but you could also use np.asarray: https://stackoverflow.com/questions/14415741/what-is-the-difference-between-np-array-and-np-asarray

TL;DR: np.asarray does not make a copy when the input is already an ndarray, whereas np.array copies by default.
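
The difference can be checked directly. This is a minimal illustration with made-up data, not code from the PR:

```python
import numpy as np

x = np.arange(5, dtype=float)

copied = np.array(x)    # copies by default
viewed = np.asarray(x)  # returns the same object when x is already an ndarray

# A list still has to be converted, so a new array is created either way.
from_list = np.asarray([1.0, 2.0])
```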

Contributor Author

Agreed, but I now see that it is actually redundant, because the input is already an np.array anyway.


return l_auto_correlations

def biased_autocorrelation(
Contributor

You currently apply this function per segment, but you could perhaps make the input a 2D array instead. All NumPy functions handle that fine in any case (via the axis parameter).

Comment on lines +110 to +115
    autocorr_values = np.zeros(max_lag + 1)

    for lag in range(max_lag + 1):
        # Compute autocorrelation for current lag
        overlapping_points = x[:N-lag] * x[lag:]
        autocorr_values[lag] = np.sum(overlapping_points) / N  # Divide by N (biased normalization)
Contributor

Perhaps you can use np.correlate here; that would be something like: autocorr_values = np.correlate(x, x, mode='full')[N-1:N+max_lag] / N.
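
A hedged comparison of the two approaches on synthetic data. The function names are hypothetical, and this only checks NumPy against NumPy; whether either matches the MATLAB reference output is not something this sketch can confirm.

```python
import numpy as np


def biased_autocorr_loop(x: np.ndarray, max_lag: int) -> np.ndarray:
    """Loop version, mirroring the snippet under review."""
    n = len(x)
    out = np.zeros(max_lag + 1)
    for lag in range(max_lag + 1):
        out[lag] = np.sum(x[: n - lag] * x[lag:]) / n
    return out


def biased_autocorr_correlate(x: np.ndarray, max_lag: int) -> np.ndarray:
    """np.correlate version; index n - 1 of the 'full' output is lag 0."""
    n = len(x)
    full = np.correlate(x, x, mode="full")
    return full[n - 1 : n + max_lag] / n


rng = np.random.default_rng(1)
x = rng.standard_normal(200)
loop_result = biased_autocorr_loop(x, max_lag=90)
corr_result = biased_autocorr_correlate(x, max_lag=90)
```

Note that `mode="full"` computes all 2N - 1 lags before slicing, so for long signals with a small `max_lag` the loop may not be slower in practice.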

Contributor Author

I'll have to check that later, because I spent some time on this to get exactly the same output as in MATLAB.

"""
Calculate relative power within the dominant frequency band in the physiological range (0.75 - 3 Hz).
"""
hr_range_idx = np.where((freqs >= 0.75) & (freqs <= 3))[0]
Contributor

I would store the frequency band for HR somewhere in your config file.

peak_idx = np.argmax(psd[hr_range_idx])
peak_freq = freqs[hr_range_idx[peak_idx]]

dom_band_idx = np.where((freqs >= peak_freq - 0.2) & (freqs <= peak_freq + 0.2))[0]
Contributor

Is 0.2 a margin you are taking here? If so, I would define the margin as a variable somewhere in the function (or in your config).
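
One way to act on both remarks is to pass the band and the margin in as named parameters. The sketch below is illustrative: the names `HR_BAND_HZ`, `PEAK_MARGIN_HZ` and `relative_power` are hypothetical, and normalizing by the summed total power is an assumption, since the thread does not show how the PR normalizes.

```python
import numpy as np

HR_BAND_HZ = (0.75, 3.0)  # physiological heart-rate band from the docstring
PEAK_MARGIN_HZ = 0.2      # half-width around the dominant frequency


def relative_power(freqs, psd, hr_band=HR_BAND_HZ, margin=PEAK_MARGIN_HZ):
    """Fraction of total PSD within +/- margin of the dominant HR frequency.

    Assumption: power is summed over uniformly spaced bins and normalized
    by the total; the PR's exact normalization is not shown in the thread.
    """
    freqs = np.asarray(freqs)
    psd = np.asarray(psd)
    hr_idx = np.where((freqs >= hr_band[0]) & (freqs <= hr_band[1]))[0]
    peak_freq = freqs[hr_idx[np.argmax(psd[hr_idx])]]
    dom_idx = np.where((freqs >= peak_freq - margin) & (freqs <= peak_freq + margin))[0]
    return np.sum(psd[dom_idx]) / np.sum(psd)


freqs = np.linspace(0, 15, 301)  # 0.05 Hz resolution
psd = np.full_like(freqs, 0.01)  # flat noise floor
psd[24] = 10.0                   # strong peak near 1.2 Hz
rel_power = relative_power(freqs, psd)
```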

Comment on lines 242 to 244
df_windowed[f'f_dom'] = l_dominant_frequencies
df_windowed[f'rel_power'] = l_relative_powers
df_windowed[f'spectral_entropy'] = l_spectral_entropies
Contributor

Here too, and in some of the other column names: the f prefix is only needed if you want to include variables in the column name.

Contributor Author

Yes, agreed. I originally had it that way and forgot to change it; done now.


for segment in ppg_segments:
# Compute power spectral density (PSD) once using Welch's method
freqs, psd = welch(
Contributor

This can also be vectorized with NumPy, but we can look at that together some other time.
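
As a rough idea of what that could look like: `scipy.signal.welch` accepts an `axis` argument, so all segments can be passed in a single call. The sampling rate, segment length and `nperseg` below are assumed values for the sketch, not the PR's settings.

```python
import numpy as np
from scipy.signal import welch

fs = 30  # Hz, assumed sampling rate for this sketch
rng = np.random.default_rng(2)
ppg_segments = rng.standard_normal((5, 6 * fs))  # 5 segments of 6 s each

# One call over all segments: welch operates along `axis`.
freqs, psd = welch(ppg_segments, fs=fs, nperseg=3 * fs, axis=1)

# Per-segment loop for comparison.
psd_loop = np.array([welch(seg, fs=fs, nperseg=3 * fs)[1] for seg in ppg_segments])
```

Here `psd` has one row per segment, so the per-segment features could then be derived with `axis=1` reductions instead of a Python loop.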

Contributor Author

Sounds good to me.

Comment on lines 70 to 72
# self.ppg_colname: List[str] = [
# DataColumns.PPG
# ]
Contributor

Can this be removed?

Contributor Author

yes

Comment on lines +153 to +155
self.sqa_window_overlap_s: int = 5
self.sqa_window_step_size_s: int = 1
Contributor

This information is also defined for feature extraction, so I would move these to PPGConfig so that you only have to define them once.

Contributor

@Erikpostt Erikpostt left a comment

Good work! Go through the comments and suggestions and see what you want to do with them. I don't see any major things that must be changed, so I can merge it whenever you're ready.

@biomarkersParkinson biomarkersParkinson deleted a comment from Erikpostt Nov 27, 2024
@KarsVeldkamp
Contributor Author

@Erikpostt Comments addressed ;)

    if statistic == 'mean':
        return [np.mean(np.abs(x)) for x in sensor_col]
    elif statistic == 'var':
        return [np.var(x, ddof=1) for x in sensor_col]  # ddof=1 for unbiased variance is used, same as matlab
Contributor

"Same as Matlab" might be confusing for a user, don't you think?

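A comment that describes what `ddof=1` does, rather than where it comes from, may be clearer to a reader. A minimal illustration with made-up numbers:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])

population_var = np.var(x)      # ddof=0 (NumPy default): divide by N
sample_var = np.var(x, ddof=1)  # ddof=1: divide by N - 1 (sample variance)
```

MATLAB's `var` normalizes by N - 1 by default, which is why `ddof=1` reproduces its output.
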
Comment on lines +107 to +108
self.window_step_size_s: int = 1
Contributor

You probably copied this from me, so I can't hold it against you, but you may want this to be a float.


self.freq_band_physio = [0.75, 3] # Hz
self.bandwidth = 0.2 # Hz
Contributor

What is this exactly?

self.sqa_window_overlap_s: int = 5
self.sqa_window_step_size_s: int = 1
min_window_length = 10
Contributor

Is this one in seconds again?


# Heart rate estimation parameters
hr_est_length = 2
Contributor

And this one?
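
The last few questions (float vs. int, what `bandwidth` means, which unit applies) could be answered in the config itself by putting units in the attribute names and using float type hints. The class below is a hypothetical sketch, not the project's actual PPGConfig, and the unit of `min_window_length` is an assumption because the thread leaves it open.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class PPGConfigSketch:
    """Hypothetical config; not the project's actual PPGConfig."""

    window_step_size_s: float = 1.0                         # seconds
    freq_band_physio_hz: Tuple[float, float] = (0.75, 3.0)  # Hz
    bandwidth_hz: float = 0.2                               # Hz, margin around the dominant frequency
    # Unit is an assumption: the thread asks whether this is in seconds.
    min_window_length_s: float = 10.0


config = PPGConfigSketch()
```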

Contributor

@Erikpostt Erikpostt left a comment

Looks fine! I asked some questions about your code, but they don't stand in the way of the merge. Small suggestions, perhaps for next time.

@Erikpostt Erikpostt merged commit 9e68513 into main Dec 2, 2024
1 check passed
@KarsVeldkamp KarsVeldkamp deleted the feature_extraction_ppg branch December 3, 2024 11:05