
high_volubility.py: chunk sizes #125

Open
GladB opened this issue Jun 3, 2019 · 3 comments
GladB commented Jun 3, 2019

It is possible to modify the size of the chunks for the first, second and third extraction steps with the --chunk_sizes argument, however the new_onset_*_minutes function used to compute the onsets to extract those chunks only exists for 2 (second extraction step) and 5 (third extraction step) minutes:

  • in new_onsets_two_minutes the new onset seems to be later than the given onset, meaning the extracted chunk, which was supposed to be centered on the smaller chunk, actually starts in the middle of it. Is that on purpose?
  • in new_onsets_five_minutes the new onset is 2 minutes before the current onset, no matter the length asked for (if the --chunk_sizes argument were [10.0, 120.0, 120.0], none of the chunks would contain the data based on which they were ranked)

I don't know which behavior was expected, but the second one at least can conflict with what the script is supposed to output.
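If new_onsets_five_minutes really subtracts a fixed 2 minutes, the worst case described above can be checked in a few lines. This is a sketch of the reported behavior, not the script's code; buggy_new_onset and the onset value are hypothetical:

```python
# Hypothetical stand-in for the reported behavior of new_onsets_five_minutes:
# the new onset is always 120 s before the current one, whatever size was asked for.
def buggy_new_onset(onset):
    return onset - 120.0

prev_size = 120.0   # size of the ranked chunk (--chunk_sizes [10.0, 120.0, 120.0])
new_size = 120.0    # size requested for the next extraction step
onset = 600.0       # onset of a ranked 120 s chunk (illustrative value)

new_onset = buggy_new_onset(onset)  # 480.0
# Overlap between the new chunk [new_onset, new_onset + new_size)
# and the ranked chunk [onset, onset + prev_size):
overlap = max(0.0, min(onset + prev_size, new_onset + new_size) - max(onset, new_onset))
# overlap == 0.0 -> the new chunk contains none of the data it was ranked on
```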

@fmetze
Contributor

fmetze commented Jun 4, 2019

@alecristia - you will know best how this is supposed to work?

@alecristia
Collaborator

No, sorry - and the second one looks like a bug, so I'm tagging Marvin

fmetze added a commit that referenced this issue Jun 18, 2019
…ally wrt #122 and #125. may still need improvements to cmd line parameters, expected behavior, or documentation/ code match
@MarvinLvn
Collaborator

MarvinLvn commented Jun 20, 2019

It is possible to modify the size of the chunks for the first, second and third extraction steps with the --chunk_sizes argument, however the new_onset_*_minutes function used to compute the onsets to extract those chunks only exists for 2 (second extraction step) and 5 (third extraction step) minutes.

Yep, this function needs the previous list of onsets. Therefore, the first one must be computed differently, with the select_onsets function.

in new_onsets_two_minutes the new onset seems to be later than the given onset, meaning the extracted chunk which was supposed to be centered on the smaller chunks starts in the middle of the smaller chunk, is that on purpose?

With the following parameters:

a) a wav file of 3000 seconds
b) --chunk_sizes 10.0 120.0 300.0
c) --nb_chunks 2
d) --step 600

  1. We compute the onsets of the 10 seconds chunks (each of them being separated by 600 sec), these onsets are :
    [145.0, 745.0, 1345.0, 1945.0, 2545.0]

  2. We keep the nb_chunks * 2 of them that contain the most speech, and we compute the onsets of the new chunks (sorted by amount of speech):
    [690.0, 90.0, 1290.0, 2490.0]

745 became 690, 145 became 90, etc.

  3. We keep the nb_chunks of them that contain the most speech, and we compute the onsets of the new chunks (the ones that will be returned by the script):
    [600.0, 0.0]

690 became 600, 90 became 0, etc.

Going back to the first list of onsets ([145.0, 745.0, 1345.0, 1945.0, 2545.0]), we see that we chose:

  1. The second one, whose first-step chunk started at 745 (centered at 750).
  2. The first one, whose first-step chunk started at 145.0 (centered at 150).
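The recomputation implied by the numbers above is center-preserving: each new onset keeps the previous chunk's midpoint, clamped at 0. This is a minimal sketch of that rule; recenter_onsets is a hypothetical helper, not the actual new_onsets_* functions from high_volubility.py:

```python
def recenter_onsets(onsets, old_size, new_size):
    """Grow each chunk around its current center: the new onset keeps the
    old chunk's midpoint, clamped so it never starts before the file."""
    return [max(0.0, o + old_size / 2.0 - new_size / 2.0) for o in onsets]

# Step 2: 120 s chunks centered on the same points as the ranked 10 s chunks
step2 = recenter_onsets([745.0, 145.0, 1345.0, 2545.0], 10.0, 120.0)
# step2 == [690.0, 90.0, 1290.0, 2490.0]

# Step 3: 300 s chunks around the two best of those
step3 = recenter_onsets(step2[:2], 120.0, 300.0)
# step3 == [600.0, 0.0]
```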

in new_onsets_five_minutes the new onset is 2 minutes before the current onset, no matter the length asked for (it could be that the --chunk_sizes argument was [10.0, 120.0, 120.0] and then none of the chunks would contain the data based on which they were ranked)

The bugs you are describing might have been fixed by this commit.
