multiscales subsetting #90

HelenaLC · 2024-11-19T11:01:19Z

It's not clear to me how to implement subsetting of multiscale ZarrArrays. E.g., say we have scales 4x4 and 2x2. Then the only subsets that could be propagated to all resolutions would be 1..2 and 3..4, which would keep the 1st and 2nd low-res. index, respectively. How is this handled on the Python side? Shall we simply restrict to valid indices (i.e., throw an error)? Or drop low-res. scales that don't comply, e.g., when subsetting 2..3 (potentially with an argument drop=TRUE/FALSE, default FALSE, that would make users aware/in control of this)? ...perhaps worth thinking about whether Mike Smith has this on his radar, i.e., representing the pyramid beyond one layer at a time.

The text was updated successfully, but these errors were encountered:

Artur-man · 2024-11-19T12:22:45Z

Let me give you an idea here, this subsetting strategy work very well in when you do subset based zooming in voltron using pyramid images:

https://github.com/BIMSBbioinfo/ImageArray
https://x.com/AltunaAkalin/status/1841864608073572707

ImageArray is programmed the same way that the current multiscale is implemented, when you subset you use the indices of the first scale, and the indices should be scaled down to subset other layers lazily.

HelenaLC · 2024-11-19T13:16:26Z

Thanks for the link, and yeah, I got that part (i.e., scaling down indices to subset low-res.). However, this doesn't resolve my confusion. Specifically,

the example given for ImageArray works because the numbers nicely scale down
e.g., 2000:3000 (length 1000) becomes 250:375 (length 125) ...or something like that
however, what would happen for the example I gave above, when high-res. indices cannot be scaled to low-res.?
e.g., (from above) arrays 4x4 and 2x2 and index 2:3 wouldn't make any sense for the 2x2 res.

HelenaLC · 2024-11-19T13:23:27Z

Based on the code here, I would answer my question by: the low-res. is kept completely in case indices don't map. E.g.,

# imagine a 4x4 array 'a', and calling 'a[2:3, ]';
# here, 'i=2' would be a 2x2 array
> curind <- c(2, 3); i <- 2
> seq(
  floor(head(curind,1)/(2^(i-1))), 
  ceiling(tail(curind,1)/(2^(i-1))))
[1] 1 2

Artur-man · 2024-11-19T13:54:10Z

I thought a lot about this when I was programming pyramids, for large images this small ceilling and floor operations to not really matter (you can interactively see it), but for small examples it looks a bit weird.

HelenaLC · 2024-11-19T13:57:30Z

Yeah, it's quite intelligent/elegant that little seq() you got there! I think I'll use this for now. I just got confused when subsetting the blobs and nothing happens. The whole pyramid thing is something that hopefully will come in from else where (e.g., Rarr) eventually... till then, we'll just hack our way through it :)

vjcitn · 2024-11-19T14:53:49Z

Just sticking my nose in. Comments in code can be really crucial, maybe linking back to these issues. Explaining the seq chunk fullly may ward off technical debt payments down the road; apologies if it is already there.
Also if there is anticipation of new functionality in Rarr do we need to raise an issue there? I am swamped with other items so not contributing much now but let me know if I am blocking anything.

HelenaLC · 2024-11-24T09:35:40Z

Alright, I have implemented the above as a temporarily working version. However, I still believe this needs to be given more thought... Specifically, [ currently only supports sequential indexing (i.e., must be a seq). I could imagine, however, that one would want to exclude part of the image somewhere, in which case we'd want to give an option to set that to NA (= black) or something, rather then crop (i.e., distorting the image)... Also, the whole rounding business (floor/ceiling) is fine for visualizations, but might mess things up for actual computations on the image (if these aren't carried out on the highest resolution), unless we restrict to indices that can be propagated across all layers... etc. etc. etc. So yeah, a pyramidal array class isn't so easy, and maybe worth having it be implemented independently eventually.

Artur-man · 2024-11-24T21:30:01Z

This is a good point and I think we should bring this up in the next SpatialData meeting at scverse. Now that I think about it, we have never tried this on SpatialData-python right ? does it do floor/ceiling ?

Also one question though, what operation you believe will mess up the images if the floor/ceiling applied to scaled indices ?

The only case I ever imagine the pyramid scheme is in fact needed ... is for visualisation where you load the lower memory allocating array. Everything else, say learning embeddings and making predictions on image tiles could be done on the actual large image.

Lets think more though i agree ....

HelenaLC added help wanted Extra attention is needed question Further information is requested labels Nov 19, 2024

HelenaLC added a commit that referenced this issue Nov 24, 2024

addressing #90; uncertainty remains...

68bbfaa

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

multiscales subsetting #90

multiscales subsetting #90

HelenaLC commented Nov 19, 2024 •

edited

Loading

Artur-man commented Nov 19, 2024 •

edited

Loading

HelenaLC commented Nov 19, 2024

HelenaLC commented Nov 19, 2024 •

edited

Loading

Artur-man commented Nov 19, 2024 •

edited

Loading

HelenaLC commented Nov 19, 2024

vjcitn commented Nov 19, 2024

HelenaLC commented Nov 24, 2024 •

edited

Loading

Artur-man commented Nov 24, 2024

multiscales subsetting #90

multiscales subsetting #90

Comments

HelenaLC commented Nov 19, 2024 • edited Loading

Artur-man commented Nov 19, 2024 • edited Loading

HelenaLC commented Nov 19, 2024

HelenaLC commented Nov 19, 2024 • edited Loading

Artur-man commented Nov 19, 2024 • edited Loading

HelenaLC commented Nov 19, 2024

vjcitn commented Nov 19, 2024

HelenaLC commented Nov 24, 2024 • edited Loading

Artur-man commented Nov 24, 2024

HelenaLC commented Nov 19, 2024 •

edited

Loading

Artur-man commented Nov 19, 2024 •

edited

Loading

HelenaLC commented Nov 19, 2024 •

edited

Loading

Artur-man commented Nov 19, 2024 •

edited

Loading

HelenaLC commented Nov 24, 2024 •

edited

Loading