
From Batik test suite to EchoSVG tests

carlosame edited this page Oct 20, 2022 · 6 revisions

One of the most important changes between Apache Batik and EchoSVG was the creation of a new testing infrastructure, and the rationale for that work is worth describing here.


The testing framework

Instead of using a standard testing framework like JUnit or TestNG, the Batik developers created their own infrastructure based on XML configuration files. Adding or removing a test meant editing the right XML configuration file, which had its own set of XML configurations. Test results were returned through a DOM document, and if an exception occurred it was suppressed and most of the information it carried was lost, which made debugging very difficult.

The old testing framework even came with a small set of self-tests but was annoying to use and essentially unmaintained. The Batik project began to migrate its own tests to JUnit but did not get very far. Therefore, the decision was made to get rid of the old testing framework and replace it with JUnit.

Once this was achieved, however, it became apparent that the framework was not the only challenge: the test suite also had reproducibility issues.


SVG Generator tests

The SVG Generator makes it possible to generate SVG through a Graphics2D object, and is one of the most popular use cases for Apache Batik. In Batik it is the batik-svggen module; in EchoSVG it is echosvg-svggen.

The ability to compare the generated SVG document with a reference is essential to the svggen tests, and Batik did that by comparing the text serializations of both documents, line by line. That was good because when a difference was detected, the test would report the expected line versus the generated one, instead of just claiming that "the two documents differ".

The problem was that the Batik DOM implementation does not enforce any specific ordering when serializing element attributes, so the same DOM document sometimes had different serializations, even when produced by the same JDK on the same computer (but under slightly different circumstances). For example, the following line

<g fill="rgb(102,102,153)" text-rendering="optimizeLegibility" font-size="15px" font-weight="bold" stroke="rgb(102,102,153)"

is effectively the same as

<g fill="rgb(102,102,153)" text-rendering="optimizeLegibility" stroke="rgb(102,102,153)" font-size="15px" font-weight="bold"

but then a bogus test failure would be triggered. This happened from time to time, and was one of the reasons why the Batik tests were not 100% deterministic and reproducible.

I then rewrote the comparison so that attribute ordering no longer mattered, but the serialization still varied occasionally, with a couple of characters moving from the end of one line to the start of the next.
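To illustrate the idea of an attribute-order-insensitive line comparison, here is a minimal sketch. It is not the EchoSVG code: the class and method names are hypothetical, and it assumes that attribute values are always double-quoted, as in the example lines above.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of Batik or EchoSVG. */
final class AttributeOrderInsensitiveLines {

    // Matches name="value" pairs. Assumes double-quoted attribute values.
    private static final Pattern ATTRIBUTE =
            Pattern.compile("([\\w:.-]+)\\s*=\\s*\"([^\"]*)\"");

    /** Collects the attributes found in a serialized line into a name-sorted map. */
    static Map<String, String> attributes(String line) {
        Map<String, String> attrs = new TreeMap<>();
        Matcher m = ATTRIBUTE.matcher(line);
        while (m.find()) {
            attrs.put(m.group(1), m.group(2));
        }
        return attrs;
    }

    /** What remains of the line once attribute pairs are removed and whitespace is collapsed. */
    static String skeleton(String line) {
        return ATTRIBUTE.matcher(line).replaceAll("").replaceAll("\\s+", " ").trim();
    }

    /** Two lines match if they have the same skeleton and the same set of attributes. */
    static boolean sameLine(String expected, String actual) {
        return skeleton(expected).equals(skeleton(actual))
                && attributes(expected).equals(attributes(actual));
    }
}
```

With this sketch, the two `<g ...` lines shown earlier compare as equal, while a changed attribute value or a different element name is still reported. A regex-based approach like this cannot tell attributes apart from look-alike text content, which is one reason a real DOM comparison is a safer final check.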

So as a final check, a full DOM comparison was executed to avoid false test failures. Unfortunately, the document/node comparison methods in Batik's DOM were buggy and had to be fixed (see #18).
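As a rough illustration of what a DOM-level comparison buys, the following sketch uses the JDK's built-in XML parser and the standard DOM Level 3 `Node.isEqualNode` method rather than Batik's or EchoSVG's DOM. The class name `DomEquality` is hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical helper for illustration; uses the JDK DOM, not Batik's. */
final class DomEquality {

    /** Parses an XML string into a namespace-aware, normalized DOM document. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            doc.normalize(); // merge adjacent text nodes before comparing
            return doc;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse document", e);
        }
    }

    /** Compares the two documents node by node; attribute order is irrelevant. */
    static boolean sameDocument(String expectedXml, String actualXml) {
        return parse(expectedXml).isEqualNode(parse(actualXml));
    }
}
```

Because attributes are held in an unordered map and line breaks inside a tag never reach the DOM, neither of the serialization differences described above affects the result. Whitespace between elements is still significant, since it produces text nodes.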


Fixing the omitted tests

A few svggen tests were not being run by Batik because it lacked the necessary features, but those were added to EchoSVG (see #16 and #17) and now all of Batik's Generator tests are run.

See one of the images from the fix to issue #16 (Batik vs EchoSVG):

[Image: font-decor-diff]

Image comparisons

All of the rendering tests rely on comparing the produced image with a reference. In Batik the comparison is made at the file stream level, so two images that are equal pixel-by-pixel may be reported as different merely because of some difference in how the image file was produced.

Batik works around this by supporting variant images: if a comparison fails, it looks for multiple variants under an accepted-variation directory. For example, a variant image produced with a certain Java version from a certain JDK vendor on a certain operating system may produce the correct match. But this approach is unreliable and unmaintainable, and it is another big factor in the non-reproducibility of the test suite.

Instead, EchoSVG uses a new image comparison infrastructure, which is described in the IMAGE_COMPARISONS.md document. In short, comparisons are now made at the pixel level, and platform variants are only used when there are reasons to expect significant differences (like different system-default fonts or colors). A new type of variant called "range variant" was introduced, which is very rarely used and fits more closely to the reference image.
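The difference between a file-level and a pixel-level comparison can be sketched as follows. This is only an illustration of the general technique, not the implementation described in IMAGE_COMPARISONS.md, and the class and method names are hypothetical.

```java
import java.awt.image.BufferedImage;

/** Hypothetical helper for illustration; not the EchoSVG comparison code. */
final class PixelComparison {

    /**
     * Counts the pixels whose ARGB value differs between the two images.
     * Returns -1 if the images do not have the same dimensions.
     */
    static long countDifferentPixels(BufferedImage reference, BufferedImage candidate) {
        int width = reference.getWidth();
        int height = reference.getHeight();
        if (width != candidate.getWidth() || height != candidate.getHeight()) {
            return -1;
        }
        long different = 0;
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                // getRGB() converts to the default ARGB model, so the way each
                // image is stored internally (or was encoded on disk) does not matter.
                if (reference.getRGB(x, y) != candidate.getRGB(x, y)) {
                    different++;
                }
            }
        }
        return different;
    }
}
```

Two images that were stored or encoded differently but contain the same opaque pixels yield a count of 0 here, whereas a byte-by-byte comparison of their files could still report a mismatch.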


WMF tests

One of the capabilities of the Batik transcoder module is to transform a Windows Metafile document to SVG. When I executed the tests, I found that the conversion was hardware-dependent, because Batik converted inches to pixels using the actual resolution of the device (an archaic convention found in very old Web specifications).

I verified that Batik was producing images of a different size than other available WMF converters. That is a problem in itself, and yet another factor preventing test reproducibility.
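The arithmetic behind this hardware dependence can be sketched as follows. The names are hypothetical, and the fixed value of 96 pixels per inch (the CSS reference resolution) is an assumption made for illustration; it is not taken from the Batik or EchoSVG sources.

```java
/** Hypothetical helper for illustration; not taken from Batik or EchoSVG. */
final class InchesToPixels {

    /** CSS reference resolution, assumed here as the device-independent choice. */
    static final double CSS_PIXELS_PER_INCH = 96.0;

    /**
     * Converts a length in metafile logical units to pixels.
     *
     * @param logicalUnits  the length in logical units
     * @param unitsPerInch  how many logical units make up one inch
     * @param pixelsPerInch the resolution used for the conversion
     */
    static int toPixels(int logicalUnits, int unitsPerInch, double pixelsPerInch) {
        return (int) Math.round(logicalUnits * pixelsPerInch / unitsPerInch);
    }
}
```

With a fixed resolution, a 2-inch-wide metafile (2880 units at 1440 units per inch) is always 192 pixels wide. If the resolution comes from the device, the same file is 192 pixels wide on a 96 dpi screen and 240 on a 120 dpi one, so the reference images only match on some machines.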

I fixed the problem with commit 6ca74495. The images from that commit speak for themselves (Batik vs EchoSVG):

[Image: wmf-diff]

The proprietary fonts

Nearly all of the Batik tests use fonts that are only available by default on Windows systems, and some use others that are non-default and have to be installed manually (for example Courier). This was an important issue for cross-platform execution, not to mention CI environments, where it would be a pain to have to install the fonts before executing the tests.

I replaced most of the proprietary fonts with free alternatives that are provided by the test suite. In the few cases where this was unsuitable (for example in the WMF sample files), the tests are only executed if the font is installed; otherwise they count as "ignored tests".
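A check of this kind could look like the following sketch, which relies only on the standard `java.awt.GraphicsEnvironment` API. The class and method names are hypothetical, and the source does not say how EchoSVG actually performs the check.

```java
import java.awt.GraphicsEnvironment;

/** Hypothetical helper for illustration; not the EchoSVG implementation. */
final class FontAvailability {

    /** Case-insensitive search for a family name in a list of installed families. */
    static boolean containsFamily(String[] installedFamilies, String family) {
        for (String name : installedFamilies) {
            if (name.equalsIgnoreCase(family)) {
                return true;
            }
        }
        return false;
    }

    /** Returns true if the given font family is available to the running JVM. */
    static boolean isInstalled(String family) {
        String[] families = GraphicsEnvironment.getLocalGraphicsEnvironment()
                .getAvailableFontFamilyNames();
        return containsFamily(families, family);
    }
}
```

One way to wire this into a JUnit 5 test (again an assumption, not necessarily what EchoSVG does) is to start the test with `Assumptions.assumeTrue(FontAvailability.isInstalled("Courier"))`, so that the test is skipped rather than failed when the font is missing.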


Conclusions

EchoSVG has a much improved, reliable test suite which is regularly run by GitHub CI. Batik now also runs a CI, but many tests are excluded from those runs (and those that aren't excluded are not 100% reproducible and may not run reliably in the future). In contrast, EchoSVG's CI runs more tests and only excludes the flaky FontArabic and VerticalText tests.

There is still some pending work to be able to execute some of the tests on non-Windows platforms, but this project has come a long way in test manageability and reproducibility.
