
Trying to get an idea about where the algorithm is flexible #2

Open

hohMiyazawa opened this issue Jan 20, 2021 · 7 comments

@hohMiyazawa (Contributor)

If you don't mind me poking around a bit more :)

This is mostly based on your description, though I've checked out some relevant parts of the code.

Clustering/grid

This seems to me the most interesting part. Typically, compression algorithms aim to partition the input into chunks of uncorrelated data, so that a set of parameters can be tailored to each individual chunk.

In this regard, this looks much more sophisticated than what's usually done, namely slicing the image up into a grid (which you also do as the next step). Would it be possible to get some sort of estimate of how much storage the clustering metadata demands? (An x-by-y grid, for instance, only has two numbers' worth of overhead.)

From what I understand, your approach is not limited to a specific algorithm for deciding on the clusters for unique decodability. This should leave room for asymmetric encoding: the decoder stays computationally cheap, since it only has to decode the single layout you ended up with, while the encoder is free to try multiple different approaches.

Calculating Lists

Every cluster has a minimum pixel value, taken as the minimum of the R, G and B values in that chunk. The final pixel value is the sum of this chunk minimum and the encoded pixel value.

It seems to me you are using a very simple predictive coding approach here: the "predictor" is a single value for each chunk/cluster.
This gives you very few of the benefits the cluster decorrelation could potentially provide. Zstd should be able to exploit much of the redundancy between neighbouring pixels in your list, but the 2D correlation is largely wasted.
While I'm not sure what the geometry of your "wiggly lines" is, an analogous technique for scanline traversal would be to store each pixel value as the difference from the pixel above, which would otherwise be far away in the list and out of reach of the entropy coder.
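
To make sure I've read the current scheme right, here is roughly what I think it amounts to, as a minimal Rust sketch. The flat interleaved-RGB layout and a single shared minimum per chunk are my assumptions, not your actual code:

```rust
// My reading of the current scheme (a sketch, not the repository's code):
// the "predictor" is one value, the chunk-wide minimum, and every byte
// is stored as its offset from that minimum.

/// Encode a chunk of interleaved RGB bytes as (minimum, offsets).
fn encode_chunk(rgb: &[u8]) -> (u8, Vec<u8>) {
    let min = rgb.iter().copied().min().unwrap_or(0);
    // min <= v for every byte, so the subtraction cannot underflow.
    (min, rgb.iter().map(|&v| v - min).collect())
}

/// Decode by adding the chunk minimum back onto every offset.
fn decode_chunk(min: u8, offsets: &[u8]) -> Vec<u8> {
    offsets.iter().map(|&o| o + min).collect()
}
```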

@hohMiyazawa (Contributor, Author) commented Jan 20, 2021

To illustrate my point, here is a very simple scheme:

  1. Read image as raw RGB bytes.
  2. Traverse the image in scanline order, storing each R, G and B value as the difference from the line above.
  3. Compress this with zstd.

(as an extreme example)
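
In Rust, the transform in steps 1 and 2 could look roughly like this. It's a sketch under my own assumptions (row-major interleaved RGB, wrapping arithmetic to stay in u8); my actual test program is C++:

```rust
// Store each byte as the difference from the byte one row above,
// wrapping mod 256 so the result stays in u8 and is exactly invertible.

fn vertical_delta(rgb: &[u8], width: usize) -> Vec<u8> {
    let stride = width * 3; // bytes per scanline for interleaved RGB
    rgb.iter()
        .enumerate()
        .map(|(i, &v)| {
            if i < stride {
                v // first row: no row above, store verbatim
            } else {
                v.wrapping_sub(rgb[i - stride])
            }
        })
        .collect()
}

fn undo_vertical_delta(deltas: &[u8], width: usize) -> Vec<u8> {
    let stride = width * 3;
    let mut rgb = Vec::with_capacity(deltas.len());
    for (i, &d) in deltas.iter().enumerate() {
        let v = if i < stride {
            d
        } else {
            d.wrapping_add(rgb[i - stride]) // row above is already decoded
        };
        rgb.push(v);
    }
    rgb
}

// Step 3 would then be any general-purpose compressor over the deltas,
// e.g. zstd::encode_all(&deltas[..], 19) with the zstd crate.
```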

Using a best/worst file-size graph again, this very simple predictor performs considerably better than using zstd directly.

My suggestion: you should look into using a spatial predictor, even a very simple one, in your "calculating lists" step.
With your advanced partitioning, you will likely also see gains from selecting a different predictor for each cluster (a rough sketch follows the graph below).

[graph attachment: toppdiff2 — best/worst file size comparison]
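
To sketch the per-cluster predictor selection (all names hypothetical, and the cost function is only a cheap proxy for the entropy-coded size):

```rust
// Try a few candidate predictors on a cluster's byte list and keep the
// one whose residuals look cheapest. The winner's id must be stored so
// the decoder knows which transform to invert.

#[derive(Clone, Copy)]
enum Predictor {
    None,  // store raw values
    Left,  // diff from the previous byte of the same channel
    Above, // diff from the byte one scanline up
}

fn residuals(rgb: &[u8], stride: usize, p: Predictor) -> Vec<u8> {
    rgb.iter()
        .enumerate()
        .map(|(i, &v)| match p {
            Predictor::Left if i >= 3 => v.wrapping_sub(rgb[i - 3]),
            Predictor::Above if i >= stride => v.wrapping_sub(rgb[i - stride]),
            _ => v, // Predictor::None, or no neighbour available yet
        })
        .collect()
}

/// Magnitude of a wrapped residual: 255 means -1, so it is "almost zero".
fn cost(res: &[u8]) -> u64 {
    res.iter().map(|&r| (r as i8).unsigned_abs() as u64).sum()
}

fn pick_predictor(rgb: &[u8], stride: usize) -> (Predictor, Vec<u8>) {
    [Predictor::None, Predictor::Left, Predictor::Above]
        .into_iter()
        .map(|p| (p, residuals(rgb, stride, p)))
        .min_by_key(|(_, res)| cost(res))
        .unwrap() // the candidate array is non-empty
}
```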

@umgefahren (Owner)

Have you already implemented this method or what's the source of your graph?

@umgefahren (Owner)

This idea of yours is pretty good, and I'm confident it will improve the system significantly. However, it will take me a long time to implement. Here is why:
As you have seen, Rust is statically typed, and that's the issue. Your proposal would require signed 8-bit integers instead of the unsigned 8-bit integers I'm currently using. This matters because if we calculate lists relative to the previous line, one or more of the RGB values can decrease, and a negative difference requires signed integers. That in itself is probably not a problem; the refactor is just immense.

@hohMiyazawa (Contributor, Author)

Yes, I implemented this transform to make the graph.

You can probably get away without using signed integers by storing the diffs with an offset of 128:

RED1 - RED2 nominally has a range of -255 to 255. However, since only the difference modulo 256 needs to be stored, a diff of 128 is equivalent to a diff of -128, a diff of 129 to -127, and so on.

The effective range is then -128 to 127. If you store that value +128, you get the range 0-255, which fits in an unsigned 8-bit integer.
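
In Rust this can stay entirely in u8 with wrapping arithmetic, so no signed refactor should be needed; a sketch of the round trip:

```rust
// Offset-128 diff trick: wrapping subtraction already folds the nominal
// -255..255 range into 0..255 (mod 256); adding 128 just centres small
// diffs around 128 instead of letting them wrap around 0 and 255.

fn encode_diff(current: u8, above: u8) -> u8 {
    current.wrapping_sub(above).wrapping_add(128)
}

fn decode_diff(stored: u8, above: u8) -> u8 {
    stored.wrapping_sub(128).wrapping_add(above)
}

#[test]
fn roundtrip_is_exact() {
    for current in 0..=255u8 {
        for above in 0..=255u8 {
            assert_eq!(decode_diff(encode_diff(current, above), above), current);
        }
    }
}
```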

@umgefahren (Owner)

Could you upload the implementation to your fork and make a pull request? If it passes all tests we can use it directly.
Otherwise, I will implement it.

@hohMiyazawa (Contributor, Author)

I wrote this as a separate C++ program, sadly.
It's probably also highly dependent on how your "wiggly lines" work, as I just used a simple scanline approach.

@umgefahren (Owner)

I'm working on it, but I can't make any promises. I've been trying to implement it for the last hour. The whole encoder and decoder is pretty badly organized and coded. It's also very hard to debug, because you never know where the error is; it could be in the encoder or the decoder.
How did you write your C++ code? Did you rewrite the whole algorithm? Maybe I don't quite get your plan yet.

I'm graduating from high school in several weeks, so I have two weeks at most to bring the code into a reasonable state.

Nevertheless, I highly appreciate your contributions.
