add consecutive_overlapping_subspans #19

Merged
merged 10 commits into from
Sep 20, 2023
2 changes: 1 addition & 1 deletion Project.toml
@@ -1,7 +1,7 @@
name = "AlignedSpans"
uuid = "72438786-fd5d-49ef-8843-650acbdfe662"
authors = ["Beacon Biosignals, Inc."]
version = "0.2.5"
version = "0.2.6"

[deps]
ArrowTypes = "31f734f8-188a-4ce0-8406-c8a06bd891cd"
2 changes: 1 addition & 1 deletion src/AlignedSpans.jl
@@ -5,7 +5,7 @@ using TimeSpans: TimeSpans, start, stop, format_duration
using StructTypes, ArrowTypes

export SpanRoundingMode, RoundInward, RoundSpanDown, ConstantSamplesRoundingMode
export AlignedSpan, consecutive_subspans, n_samples
export AlignedSpan, consecutive_subspans, n_samples, consecutive_overlapping_subspans

# Make our own method so we can add methods for Intervals without piracy
duration(span) = TimeSpans.duration(span)
64 changes: 56 additions & 8 deletions src/utilities.jl
@@ -13,23 +13,71 @@ Returns the number of samples present in the span `aligned`.
n_samples(aligned::AlignedSpan) = length(indices(aligned))

"""
consecutive_subspans(span::AlignedSpan, duration::Period)
consecutive_subspans(span::AlignedSpan, duration::Period; keep_last=true)
consecutive_subspans(span::AlignedSpan, n::Int; keep_last=true)

Creates an iterator of `AlignedSpan` such that each `AlignedSpan` has consecutive indices
which cover all of the original `span`'s indices. In particular,
which cover the original `span`'s indices (when `keep_last=true`).

* Each span has `n = n_samples(span.sample_rate, duration)` samples, except possibly
* If `keep_last=true` (the default behavior), then the last span may have fewer samples than the others, and
* Each span has `n` samples (which is calculated as `n_samples(span.sample_rate, duration)` if `duration::Period` is supplied), except possibly
the last one, which may have fewer.
* The number of subspans is given by `cld(n_samples(span), n)`
* The number of samples in the last subspan is `r = rem(n_samples(span), n)` unless `r=0`, in which
* The number of subspans is given by `cld(n_samples(span), n)`,
* The number of samples in the last subspan is `r = rem(n_samples(span), n)` unless `r=0`, in which
case the last subspan has the same number of samples as the rest, namely `n`.
* All of the indices of `span` are guaranteed to be covered by exactly one subspan
* If `keep_last=false`, then all spans will have the same number of samples:
* Each span has `n` samples (which is calculated as `n_samples(span.sample_rate, duration)` if `duration::Period` is supplied)
* The number of subspans is given by `fld(n_samples(span), n)`
* The last part of the `span`'s indices may not be covered (when we can't fit in another subspan of length `n`)
"""
function consecutive_subspans(span::AlignedSpan, duration::Period)
function consecutive_subspans(span::AlignedSpan, duration::Period; keep_last=true)
n = n_samples(span.sample_rate, duration)
return consecutive_subspans(span::AlignedSpan, n)
return consecutive_subspans(span::AlignedSpan, n; keep_last)
end

function consecutive_subspans(span::AlignedSpan, n::Int)
function consecutive_subspans(span::AlignedSpan, n::Int; keep_last=true)
index_groups = Iterators.partition((span.first_index):(span.last_index), n)
if !keep_last
r = rem(n_samples(span), n)
if r != 0
Member Author

I know the tests test this branch, bc originally I accidentally had r==0 and they failed!

# Drop the last element
grps = Iterators.take(index_groups, fld(n_samples(span), n))
return (AlignedSpan(span.sample_rate, first(I), last(I)) for I in grps)
end
end
return (AlignedSpan(span.sample_rate, first(I), last(I)) for I in index_groups)
end
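
The `keep_last` behavior documented above can be illustrated without the package. The following is only a minimal sketch: plain index ranges stand in for `AlignedSpan`s, and `index_chunks` is a hypothetical helper, not part of AlignedSpans. It assumes the same `Iterators.partition` / `Iterators.take` approach as the function in this diff.

```julia
# Standalone sketch; `index_chunks` is a hypothetical name, not package API.
function index_chunks(first_index::Int, last_index::Int, n::Int; keep_last::Bool=true)
    total = last_index - first_index + 1
    # Split the indices into consecutive groups of (at most) `n`.
    groups = Iterators.partition(first_index:last_index, n)
    if !keep_last
        # Keep only the `fld(total, n)` full groups; a shorter trailing group is dropped.
        groups = Iterators.take(groups, fld(total, n))
    end
    return collect(groups)
end

with_last = index_chunks(1, 10, 4)                      # [1:4, 5:8, 9:10]
without_last = index_chunks(1, 10, 4; keep_last=false)  # [1:4, 5:8]
```

With 10 indices and `n = 4`, `keep_last=true` gives `cld(10, 4) = 3` chunks, the last holding `rem(10, 4) = 2` indices, while `keep_last=false` gives `fld(10, 4) = 2` chunks and leaves indices 9 and 10 uncovered.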

"""
consecutive_overlapping_subspans(span::AlignedSpan, duration::Period,
hop_duration::Period)
consecutive_overlapping_subspans(span::AlignedSpan, n::Int, m::Int)

Create an iterator of `AlignedSpan` such that each `AlignedSpan` has
`n` (calculated as `n_samples(span.sample_rate, duration)` if `duration::Period` is supplied) samples, shifted by
`m` (calculated as `n_samples(span.sample_rate, hop_duration)` if `hop_duration::Period` is supplied) samples between
consecutive spans.

!!! warning
When `n_samples(span)` is not an integer multiple of `n`, only AlignedSpans with `n`
samples will be returned. This is analogous to `consecutive_subspans` with `keep_last=false`, which is not the default behavior for `consecutive_subspans`.
Contributor

huh. i wonder if we should add keep_last as a kwarg, for parity??

Member Author

Yeah, I thought about that (see OP). I think it is a bit hard to know exactly what those semantics should be when there's overlap. I guess something like: if there's 8 fold overlap, the last 7 will all be shorter than usual, and the last one will be 1 sample long. But it's a bit weird. I'd rather leave that until someone needs it.
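
For illustration only, one possible reading of those semantics can be sketched with plain index ranges. This is not implemented in the PR; `overlapping_windows_keep_partial` is a hypothetical helper, and clipping each window at the last index is an assumption about what `keep_last=true` might mean when windows overlap.

```julia
# Hypothetical sketch, not part of AlignedSpans: every hop starts a window,
# and windows that would run past `last_index` are clipped instead of dropped.
function overlapping_windows_keep_partial(first_index::Int, last_index::Int,
                                          n::Int, m::Int)
    return [i:min(i + n - 1, last_index) for i in first_index:m:last_index]
end

# 8-fold overlap: windows of 8 samples, hopping by 1 sample, over 10 samples.
windows = overlapping_windows_keep_partial(1, 10, 8, 1)
lengths = length.(windows)  # [8, 8, 8, 7, 6, 5, 4, 3, 2, 1]
```

Under this reading, the last 7 windows are shorter than `n` and the final one is a single sample long, which matches the scenario described in the comment.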


Note: If `hop_duration` cannot be represented as an integer number of samples,
rounding will occur to ensure that all output AlignedSpans will have the
same number of samples. When rounding occurs, the output hop_duration will be:
`Nanosecond(n_samples(samp_rate, hop_duration) / samp_rate * 1e9)`
Member Author

this docstring is lightly edited from the private package it was taken from

"""
function consecutive_overlapping_subspans(span::AlignedSpan, duration::Period,
hop_duration::Period)
n = n_samples(span.sample_rate, duration)
m = n_samples(span.sample_rate, hop_duration)
return consecutive_overlapping_subspans(span::AlignedSpan, n, m)
end

function consecutive_overlapping_subspans(span::AlignedSpan, n::Int, m::Int)
index_groups = Iterators.partition((span.first_index):(span.last_index - n + 1),
m)
return (AlignedSpan(span.sample_rate, first(I), first(I) + n - 1)
for I in index_groups)
end
Member Author

this code is untouched from the private package from which it was copied
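
The index arithmetic in `consecutive_overlapping_subspans(span, n, m)` can be traced with plain ranges. The sketch below assumes the same partition-the-valid-starts approach as the function above; `overlapping_index_windows` is a hypothetical stand-in that returns index ranges rather than `AlignedSpan`s.

```julia
# Standalone sketch; `overlapping_index_windows` is a hypothetical name.
function overlapping_index_windows(first_index::Int, last_index::Int, n::Int, m::Int)
    # A window starting at `i` covers `i:(i + n - 1)`, so the last valid
    # start is `last_index - n + 1`.
    starts = first_index:(last_index - n + 1)
    # Grouping the valid starts `m` at a time and taking the first element
    # of each group selects every `m`-th start.
    groups = Iterators.partition(starts, m)
    return [first(I):(first(I) + n - 1) for I in groups]
end

windows = overlapping_index_windows(1, 10, 4, 2)  # [1:4, 3:6, 5:8, 7:10]
```

Every window has exactly `n` indices, and the window count equals `fld(total - n, m) + 1`, the same formula the tests use for `n_complete_windows`. With 11 indices instead of 10, the result is unchanged and index 11 is simply not covered.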

84 changes: 84 additions & 0 deletions test/utilities.jl
@@ -11,6 +11,26 @@ function test_subspans(aligned, sample_rate, dur)
else
@test n_samples(subspans[end]) == n_samples(sample_rate, dur)
end

# Ends at the end
@test subspans[end].last_index == aligned.last_index

# also test w/ `keep_last=false`
return test_subspans_skip_last(aligned, sample_rate, dur)
end

function test_subspans_skip_last(aligned, sample_rate, dur)
@show aligned, sample_rate, dur
subspans = collect(consecutive_subspans(aligned, dur; keep_last=false))
@test length(subspans) == fld(n_samples(aligned), n_samples(sample_rate, dur))
for i in 1:(length(subspans) - 1)
@test subspans[i + 1].first_index == subspans[i].last_index + 1 # consecutive indices
@test n_samples(subspans[i]) == n_samples(sample_rate, dur) # each has as many samples as prescribed by the duration
end

# Does not necessarily end all the way at the end, but gets within `n`
@test aligned.last_index - n_samples(sample_rate, dur) <= subspans[end].last_index <=
aligned.last_index
end

@testset "consecutive_subspans" begin
@@ -37,3 +57,67 @@ end

@test_throws ArgumentError consecutive_subspans(aligned, Millisecond(1))
end

@testset "consecutive_overlapping_subspans" begin
Member Author

this whole testset is untouched from the private package from which it was copied

Member Author

(new stuff added in 8eabc1e)

# when window_duration == hop duration and window_duration fits into
# samples_span exactly n times, the output of consecutive_overlapping_subspans
# should equal that of consecutive_subspans
samples_span = AlignedSpan(128, 1, 128 * 120)
window_samples = 10 * 128
subspans = consecutive_overlapping_subspans(samples_span, window_samples,
window_samples)
og_subspans = consecutive_subspans(samples_span, window_samples)
@test all(collect(subspans) .== collect(og_subspans))

# Check w/ Period's
subspans2 = consecutive_overlapping_subspans(samples_span, Second(10),
Second(10))
@test all(collect(subspans) .== collect(subspans2))

# when window_duration == hop duration but window_duration does not
# fit evenly into samples_span, consecutive_subspans will return a
# last AlignedSpan with n_samples < n_samples(window_duration), whereas
# consecutive_overlapping_subspans will omit the last window and only
# return AlignedSpans with n_samples = n_samples(window_duration)
window_samples = 11 * 128
subspans = consecutive_overlapping_subspans(samples_span, window_samples,
window_samples)
og_subspans = consecutive_subspans(samples_span, window_samples)
c_subspans = collect(subspans)
@test length(collect(og_subspans)) - 1 == length(c_subspans)
@test all(n_samples.(c_subspans) .== window_samples)

# Check w/ Period's
subspans2 = consecutive_overlapping_subspans(samples_span, Second(11),
Second(11))
@test all(collect(subspans) .== collect(subspans2))

# when hop_samples < window_samples
window_samples = 10 * 128
hop_samples = 5 * 128
n_complete_windows = fld((n_samples(samples_span) - window_samples), hop_samples) + 1
subspans = consecutive_overlapping_subspans(samples_span, window_samples, hop_samples)
c_subspans = collect(subspans)
@test length(c_subspans) == n_complete_windows
@test all(n_samples.(c_subspans) .== window_samples)

# Check w/ Period's
subspans2 = consecutive_overlapping_subspans(samples_span, Second(10),
Second(5))
@test all(collect(subspans) .== collect(subspans2))

# hop_samples < window_samples and window_samples does not fit exactly into
# samples_span
window_samples = 11 * 128
hop_samples = 5 * 128
n_complete_windows = fld((n_samples(samples_span) - window_samples), hop_samples) + 1
subspans = consecutive_overlapping_subspans(samples_span, window_samples, hop_samples)
c_subspans = collect(subspans)
@test length(c_subspans) == n_complete_windows
@test all(n_samples.(c_subspans) .== window_samples)

# Check w/ Period's
subspans2 = consecutive_overlapping_subspans(samples_span, Second(11),
Second(5))
@test all(collect(subspans) .== collect(subspans2))
end