Increase Watermark Robustness #20

mhellmeier · 2024-03-19T13:24:12Z

🚀 Feature Request

Current Problem

Editing a watermarked text can destroy the watermark embedded in the cover text. This can happen when sentences are moved within the text, when content is deleted or added, or when existing content is copied.

Proposed Solution

The overall robustness of the watermarker library needs to be increased.

One possible example:
If a small watermark is embedded repeatedly in a long cover text (e.g., 10 times), the watermarker library should still be able to extract the watermark even if 4 of the 10 repetitions are destroyed.

Additional Context

If a control character is implemented (see #18), the watermarker library needs a strategy for the case where this control character is destroyed.

mhellmeier · 2024-05-24T14:19:40Z

First starting points:

* [ ]  Update the squashing method for all watermarks so that it only returns one watermark based on the length (robustness increase for all watermarks)
* [ ]  Update the `SizedWatermark` so that it uses the size of the watermark (robustness increase for Sized Trendmarks)
* [ ]  Add an optional parameter to the watermark extraction that indicates whether multiple unique watermarks might be included (like multipleWatermarks = true). Optionally add a suggested number of watermarks if that parameter is set to true.
* [ ]  Check error correcting codes for the CRC32 trendmarks
* [ ]  Update Trendmark documentation with recommendations for robustness (for example, suggest using the `SizedCRC32Watermark` if robustness is important and mention the trade-offs).

Minor thing:

Update all the naming from SizedWatermark to SizedTrendmark etc.

Future ideas:

* Use a large language model to judge which candidate was the original watermark
* An optional toggle that filters for or prefers watermarks composed of "sensible" Unicode characters only (e.g. extended Latin alphabet, Arabic numerals, ampersand)

hnorkowski · 2024-06-28T08:53:46Z

First starting points:

* [ ]  Update the squashing method for all watermarks so that it only returns one watermark based on the length (robustness increase for all watermarks)

For basic watermarks (i.e. just a list of bytes, with no knowledge of what it represents), the most plausible watermark can be found with frequency analysis. This approach is generic and could be implemented as a static function in Watermark. It needs to be a static function rather than a method because it takes a list of watermarks. The Watermarker and JvmWatermarker could use the feature by default.
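
A minimal sketch of that frequency analysis, written in Python as a language-neutral illustration. The function name `most_frequent_watermark` and the use of plain `bytes` for watermarks are assumptions for this example, not the library's API.

```python
from collections import Counter
from typing import Iterable, Optional


def most_frequent_watermark(candidates: Iterable[bytes]) -> Optional[bytes]:
    """Return the candidate that occurs most often, or None for empty input.

    Hypothetical helper: every extracted repetition is treated as an opaque
    byte string and only exact matches are counted.
    """
    counts = Counter(candidates)
    if not counts:
        return None
    # most_common(1) yields the (value, count) pair with the highest count;
    # candidates with equal counts keep the order in which they were first seen.
    return counts.most_common(1)[0][0]


# Example from the issue description: 10 repetitions, 4 of them damaged.
extracted = [b"owner:42"] * 6 + [b"owner:4", b"wner:42", b"", b"owner:42owner"]
best = most_frequent_watermark(extracted)
```

Exact-match counting only helps while the intact copies outnumber any single damaged variant; it does not attempt to merge partially damaged copies.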

* [ ]  Update the `SizedWatermark` so that it uses the size of the watermark (robustness increase for Sized Trendmarks)

The basic usage is already implemented. The validate method of Trendmark checks for correct size, checksum, and hash (depending on the variant of Trendmark).
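
To illustrate what a size and checksum check of this kind can look like, here is a small Python sketch. The byte layout (4-byte little-endian content length, 4-byte CRC32, then the content) and the names `encode` and `validate` are hypothetical and do not describe the actual Trendmark format.

```python
import struct
import zlib
from typing import List

# Hypothetical header: content length and CRC32 of the content,
# both as unsigned 32-bit little-endian integers.
HEADER = struct.Struct("<II")


def encode(content: bytes) -> bytes:
    """Prefix the content with its length and CRC32."""
    return HEADER.pack(len(content), zlib.crc32(content)) + content


def validate(raw: bytes) -> List[str]:
    """Return a list of problems; an empty list means the watermark looks intact."""
    if len(raw) < HEADER.size:
        return ["too short to contain a header"]
    size, checksum = HEADER.unpack_from(raw)
    content = raw[HEADER.size:]
    problems = []
    if size != len(content):
        problems.append(f"size mismatch: header says {size}, found {len(content)}")
    if zlib.crc32(content) != checksum:
        problems.append("checksum mismatch")
    return problems


intact = encode(b"owner:42")
truncated = intact[:-1]  # simulate a watermark that lost its last byte
```

Reporting all problems instead of stopping at the first one makes it possible to distinguish a truncated watermark (size and checksum both wrong) from one with substituted content (checksum wrong only).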

* [ ]  Check error correcting codes for the CRC32 trendmarks

Checksum validation is already implemented in the `validateChecksum` method, which is called automatically by the `validate` function. It may be possible to correct errors with the CRC32 codes, but I am not sure: CRC32 can correct single-bit errors, whereas our text watermarking algorithm does not work with bits; it currently works with 4 states. A new method `repair` could be added to the `Checksum` interface, so that each implementing checksum can provide a recovery strategy specific to that checksum, if possible.
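
One way such a `repair` could work for CRC32 is a brute-force search: try every single-symbol substitution and keep the result only if exactly one candidate matches the stored checksum. The Python sketch below assumes that each of the 4 states maps to one aligned 2-bit group of the content and that the stored CRC32 itself is undamaged; both are assumptions for illustration, and the function name is hypothetical. Deleted or inserted symbols are not handled.

```python
import zlib
from typing import Optional


def repair_single_symbol(content: bytes, expected_crc: int) -> Optional[bytes]:
    """Try to undo one substituted 2-bit symbol so that the CRC32 matches again.

    Returns the repaired content, the unchanged content if it already matches,
    or None if no unique single-symbol repair exists.
    """
    if zlib.crc32(content) == expected_crc:
        return content
    matches = set()
    for index, byte in enumerate(content):
        for shift in (0, 2, 4, 6):       # the four 2-bit groups inside one byte
            for delta in (1, 2, 3):      # XOR values giving the 3 other states
                patched = byte ^ (delta << shift)
                candidate = content[:index] + bytes([patched]) + content[index + 1:]
                if zlib.crc32(candidate) == expected_crc:
                    matches.add(candidate)
    # Refuse to guess when the repair is ambiguous or impossible.
    return matches.pop() if len(matches) == 1 else None


original = b"watermark"
crc = zlib.crc32(original)

damaged = bytearray(original)
damaged[3] ^= 0b00001100  # change one 2-bit symbol in the fourth byte
repaired = repair_single_symbol(bytes(damaged), crc)
```

The search costs 12 CRC computations per content byte, which is cheap for short watermarks but only covers substitutions of a single symbol.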

* [ ]  Update Trendmark documentation with recommendations for robustness (for example, suggest using the `SizedCRC32Watermark` if robustness is important and mention the trade-offs).

Further analysis methods could be implemented as static methods in Trendmark. The extraction methods of (Jvm)Watermarker could be extended with another parameter, trashing: Bool, that defines whether all Trendmarks that produce an error or warning are discarded.
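
The effect of such a `trashing` flag can be sketched as a simple filter over the extraction results. The `ExtractionResult` type and the function `filter_results` below are hypothetical stand-ins written in Python, not the library's types.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class ExtractionResult:
    """Hypothetical container: one extracted watermark plus its validation status."""
    content: bytes
    warnings: List[str] = field(default_factory=list)
    errors: List[str] = field(default_factory=list)


def filter_results(results: Iterable[ExtractionResult],
                   trashing: bool = False) -> List[ExtractionResult]:
    """Keep everything by default; with trashing=True drop results that
    carry at least one warning or error."""
    if not trashing:
        return list(results)
    return [r for r in results if not r.warnings and not r.errors]


results = [
    ExtractionResult(b"owner:42"),
    ExtractionResult(b"owner:4", errors=["checksum mismatch"]),
    ExtractionResult(b"owner:42", warnings=["size mismatch"]),
]
clean = filter_results(results, trashing=True)
```

Defaulting to `trashing=False` keeps the current behaviour, so callers only lose damaged watermarks when they ask for it.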

mhellmeier added the feature and component: watermarker labels Mar 19, 2024

mhellmeier assigned hnorkowski May 24, 2024

hnorkowski mentioned this issue May 29, 2024

refactor(watermarker): rename Textmark to TextWatermark #65

Merged

mhellmeier unassigned hnorkowski Jun 7, 2024
