-
Notifications
You must be signed in to change notification settings - Fork 267
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Using Stax2 (Woodstox) instead of plexus-xml (Xpp) to process XML #1174
Conversation
…uments; reimplemented the mutable reader using an XMLStreamReader2, added some cleanup and more documentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, we can try ... every ITs look ok
@andrzejj0 we can try with https://codehaus-plexus.github.io/modello/modello-maven-plugin/index.html |
@slawekjaranowski I tried. It almost worked. However, there's one problem. It seems that the reader generated by modello-stax is not forgiving when it comes to XSD validation. It is only able to parse the namespace matching the namespace pattern defined in the .mdo file. With the recent changes, we have the namespace of Making this plugin strict is an easy choice, but shortsighted. This will create a problem with backwards compatibility where people blindly updating the plugin or using the latest version will suddenly have a broken CI/CD pipeline because they've still been using the old namespace somewhere in their rules file and it will not work anymore. So, ideally, it would be great to have a lenient parser, like we apparently do in case of Xpp3. Should I file a bug/feature with Modello (I could even attempt to implement it)? |
It seems that strict namespace checking is built into // legacy hack for pomVersion == 3
if ( "3".equals( modelVersion ) )
{
modelVersion = "3.0.0";
}
if ( !"2.1.0".equals( modelVersion ) )
{
throw new XMLStreamException( "Document model version of '" + modelVersion + "' doesn't match reader version of '2.1.0'", xmlStreamReader.getLocation() );
} This actually requires that we're only getting version Next to introducing a feature to ignore namespace or just the version, maybe we should actually just create different model versions whenever we introduce a new namespace, and maintain as many namespace versions as we agree to support. That will surely make our code bigger and more difficult to maintain, but that's what you get when you introduce new namespaces, I guess. Until this is sorted I think that we should just keep Xpp3 for the time being. |
Alternatively, if we want to simply not use plexus-xml, we could use dom4j, but it requires dom4j:dom4j which currently has 2 open CVEs. |
we can try to fix modello-stax should be work similar as modello with plexus |
That's one option. Another option is to use modello-dom4j. Then we also don't use Plexus and we're not after performance here anyway. Unfortunately, dom4j:dom4j currently has 2 open CVEs. That held me from going for it. |
I briefly looked at the code of the Modello plugin. It looks like the version check is only added if block is present in the model definition file. I agree that the Modello plugin needs to behave in a consistent manner across parser types, so perhaps there needs to come a new argument there to choose whether this namespace check is needed. |
we can start a discussion about it at Modello issues https://github.com/codehaus-plexus/modello/issues |
Second stage of convergence with Maven 4 chages: this time getting rid of Plexus XML for document parsing.
Using Stax2 (Woodstox) instead of plexus-xml (Xpp) to process XMLuments; reimplemented the mutable reader using an XMLStreamReader2, added some cleanup and more documentation.
XMLStreamReader2
providing both the start and the end of the current token in itsLocationInfo
, it was not necessary to cache the next token like theModifiedXMLEventReader
did, which saves us a lot of complexity and makes the design more compatible with the delegate;hasNext()
does not proceed the cursor to the next event (like was the case with the previous implementation),next()
doesIt's not possbible to switch entirely to Stax2 because of Modello. But aside from that, most stuff has been replaced. Thanks to using a Stax parser instead of an event-based parser, we can also see some performance improvements (see integration test times for example).
Added some polish, cleanup, documentation (especially to
MutableXMLStreamReader
as it's the central piece of this whole plugin).It's quite a big change. Please review. Should we push this to 2.18.0 or postpone it a bit?