The character sequence "]]" is not written inside cdata section. One "]" is discarded. #100

Open
lepokle opened this issue Nov 6, 2024 · 0 comments

lepokle commented Nov 6, 2024

Problem

Assume you have a CDATA section with the following content:

  This is a ]] > test.

(This is how Atlassian Confluence "escapes" the end-of-CDATA marker within a CDATA section.)
If you write this sequence using XMLStreamWriter2.writeCData, you'll see the following content in the output file:

This is a ] > test.

Yet the original sequence is fine from an XML point of view, so it should be written unchanged.
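
To back up the claim that the original sequence is legal inside a CDATA section, here is a minimal check that uses only the JDK's built-in DOM parser (not Aalto). The class name CDataBracketCheck and the helper parseRootText are made up for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class CDataBracketCheck {

    // Hypothetical helper: parses the document and returns the text of the root element.
    static String parseRootText(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            // A parse failure would mean the input is not well-formed XML.
            throw new IllegalArgumentException("not well-formed: " + xml, e);
        }
    }

    public static void main(String[] args) {
        // "]] >" contains a space before '>', so it does not terminate the section.
        String xml = "<root><![CDATA[This is a ]] > test.]]></root>";
        System.out.println(parseRootText(xml));
    }
}
```

If the parser accepts the document and returns the text with both brackets intact, a writer has no reason to alter the sequence.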

Reason

Inside ByteXmlWriter.writeCDataContents, each character is checked and written to the output. When it encounters a ] character, it checks whether the next character is another ] and whether that one is followed by a >. In that case the end sequence is handled properly.
If any other character follows the second ], it continues writing characters, but without writing the first ] to the output buffer, so that bracket is lost.

Solution

The other case (no > character follows the double ]) must be handled properly by also writing the first detected ] to the output (line 851):

if (offset < len && cbuf[offset] == ']') {
    if ((offset+1) < len && cbuf[offset+1] == '>') {
        // Ok, need to output ']]' first, then end
        offset += 2;
        writeRaw(BYTE_RBRACKET, BYTE_RBRACKET);
        writeCDataEnd();
        // Then new start, and '>'
        writeCDataStart();
        writeRaw(BYTE_GT);
    } else {
        // no end found, write first bracket
        if (_outputPtr >= _outputBufferLen) {
            flushBuffer();
        }
        _outputBuffer[_outputPtr++] = (byte) ch;
    }
    continue main_loop;
}
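
For anyone who wants to see the effect without building Aalto, here is a simplified standalone model of the loop. It is a sketch only, not the actual ByteXmlWriter code: it works on a StringBuilder instead of a byte buffer, and the class CDataLoopModel and its `fixed` flag are invented for this illustration.

```java
public class CDataLoopModel {
    private static final String CDATA_START = "<![CDATA[";
    private static final String CDATA_END = "]]>";

    // Writes text as CDATA. With fixed == false the first ']' of a "]]" pair
    // that is not followed by '>' is dropped, as described in this issue.
    static String writeCData(String text, boolean fixed) {
        StringBuilder out = new StringBuilder(CDATA_START);
        char[] cbuf = text.toCharArray();
        int len = cbuf.length;
        int offset = 0;
        while (offset < len) {
            char ch = cbuf[offset++];
            if (ch == ']' && offset < len && cbuf[offset] == ']') {
                if (offset + 1 < len && cbuf[offset + 1] == '>') {
                    // "]]>" found: emit "]]", close the section, reopen it, then emit '>'
                    offset += 2;
                    out.append("]]").append(CDATA_END).append(CDATA_START).append('>');
                } else if (fixed) {
                    // No end marker: still write the first bracket
                    out.append(ch);
                }
                // The second ']' (if not consumed) is handled by the next iteration
                continue;
            }
            out.append(ch);
        }
        return out.append(CDATA_END).toString();
    }

    public static void main(String[] args) {
        String input = "This is a ]] > test.";
        System.out.println(writeCData(input, false)); // <![CDATA[This is a ] > test.]]>
        System.out.println(writeCData(input, true));  // <![CDATA[This is a ]] > test.]]>
    }
}
```

In this model a real "]]>" in the input is still split across two CDATA sections in both modes; only the handling of "]]" without a following > differs.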
lepokle added a commit to lepokle/aalto-xml that referenced this issue Nov 6, 2024
Write first ] if a sequence of ]] brackets without a following > is found.