Based on what classifiers use, define "Semantically Compressed" data set #200

Open

wmwv opened this issue Jul 13, 2023 · 2 comments
Labels
Discussion Requesting feedback from specialists, and discussion amongst those interested Enhancement New feature or request Pipeline: Science Components producing science output

Comments

@wmwv
Collaborator

wmwv commented Jul 13, 2023

Based on the information used by downstream classifiers, identify what information must be kept in alert packets for them to retain significant utility.

The current LSST packets are, very loosely, half pixels and half derived numbers.

  1. Pixel-based classifiers. Should these be assumed to always use the full alert packet? If a classifier is pixel-based, it may not need most of the derived numbers.
  2. Non-pixel-based classifiers. Drop the stamps. Do these classifiers use the summary statistics?
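To make the split concrete, here is a minimal sketch of what "semantically compressing" an alert for non-pixel-based classifiers could look like: drop the cutout stamps, keep the derived numbers. The field names loosely follow the ZTF alert schema but should be treated as illustrative, not as the broker's actual implementation.

```python
# Sketch of semantic compression: strip the pixel cutouts from an alert
# dict, keeping only the derived/summary fields. Field names are
# illustrative (loosely ZTF-like), not the broker's real schema.

STAMP_FIELDS = {"cutoutScience", "cutoutTemplate", "cutoutDifference"}

def semantically_compress(alert: dict) -> dict:
    """Return a copy of the alert with the stamp cutouts dropped."""
    return {k: v for k, v in alert.items() if k not in STAMP_FIELDS}

alert = {
    "objectId": "ZTF18abcdefg",
    "candidate": {"magpsf": 18.3, "sigmapsf": 0.05, "fid": 2},
    "cutoutScience": {"stampData": b"\x89PNG..."},
    "cutoutTemplate": {"stampData": b"\x89PNG..."},
    "cutoutDifference": {"stampData": b"\x89PNG..."},
}

lite = semantically_compress(alert)
assert "cutoutScience" not in lite
assert lite["candidate"]["magpsf"] == 18.3
```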
@troyraen
Collaborator

The current module called "lite" performs this semantic compression, so (rephrasing the issue) we should revisit which fields that module drops versus retains.

Currently:

  • All fields that are needed by downstream modules within the broker are kept, plus a couple more (I think) but not many. We should revisit with an eye toward what downstream users might want/need.
  • The stamps are dropped, so pixel-based classifiers need to get the alert packet from Cloud Storage. There is a technical issue with including the stamps in the Pub/Sub stream that is not insurmountable but would need to be addressed. (I'd have to refresh my memory on the details, but it's something to do with the fact that the stamps, as provided, cannot be re-serialized to JSON. Most of our streams use JSON because that's Pub/Sub's default, but Avro is an option and maybe we should do that anyway to more closely match the format sent by surveys.)
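The JSON serialization problem mentioned above can be illustrated with a short, self-contained sketch: Python's `json` module cannot serialize raw bytes (which is roughly the shape of the stamp data), so shipping stamps in a JSON Pub/Sub stream requires an extra encoding step. This is a generic illustration of the failure mode, not the broker's actual code.

```python
import base64
import json

stamp = b"\x89PNG\r\n\x1a\n..."  # stand-in for raw cutout bytes

# json.dumps cannot serialize raw bytes -- roughly why the stamps
# can't ride along in a JSON Pub/Sub message as-is.
try:
    json.dumps({"cutoutScience": stamp})
except TypeError:
    pass  # "Object of type bytes is not JSON serializable"

# One workaround: base64-encode before serializing; consumers decode.
msg = json.dumps({"cutoutScience": base64.b64encode(stamp).decode("ascii")})
roundtrip = base64.b64decode(json.loads(msg)["cutoutScience"])
assert roundtrip == stamp
```

Avro, by contrast, has a native `bytes` type, so switching the streams to Avro (as suggested above) would sidestep the encoding step and more closely match the format the surveys send.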

@troyraen troyraen added Enhancement New feature or request Discussion Requesting feedback from specialists, and discussion amongst those interested Pipeline: Science Components producing science output labels Jul 13, 2023
@wmwv
Collaborator Author

wmwv commented Jul 13, 2023

So a worked example of a pixel-based classifier would be helpful for understanding the performance and cost of pulling the packet from Cloud Storage.
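As a starting point for such a worked example, the retrieval step might look like the sketch below. The bucket name and blob path layout are hypothetical (check the broker's actual naming convention); `download_as_bytes` is a real `google-cloud-storage` client method, and the import is deferred so the path helper can be used without credentials.

```python
def alert_blob_name(object_id: str, candid: int) -> str:
    """Build the storage path for an alert packet.

    The path layout here is hypothetical -- substitute the broker's
    actual bucket naming convention.
    """
    return f"alerts/{object_id}/{candid}.avro"

def fetch_alert_packet(bucket_name: str, object_id: str, candid: int) -> bytes:
    """Download the full alert packet (stamps included) from Cloud Storage."""
    from google.cloud import storage  # deferred: needs credentials at runtime

    client = storage.Client()
    blob = client.bucket(bucket_name).blob(alert_blob_name(object_id, candid))
    return blob.download_as_bytes()
```

Timing `fetch_alert_packet` over a batch of alerts would give a first estimate of the per-alert latency a pixel-based classifier pays for not having stamps in the stream.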
