Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix all whitespace specs #215

Open
wants to merge 14 commits into
base: master
Choose a base branch
from

Conversation

dpoetzsch
Copy link
Contributor

This work builds upon #212 and finishes whitespace handling for all other spec cases.

All fixes are only in the parser. I took special care that the changes only take place if the "whitespace" flag is set. Otherwise the parser behaves exactly as before.

@spullara
Copy link
Owner

A few comments (thanks for digging into this):

  1. Do you think it makes sense to pass the indent around in the TemplateContext so it doesn't change the callback interface?

  2. It looks like the indent writer will also indent if a value contains a \n rather than just the template containing one. Is that intended? If not, it looks like we could remember where the \n are and put special handling instead in appendText that wouldn't have to reparse the output.

  3. What is the nature of the other spec tests failing?

@dpoetzsch
Copy link
Contributor Author

dpoetzsch commented Sep 19, 2018

  1. I did not really grasp what the TemplateContext is used for. The indent is as far as I know only used for partials so passing it to the other parts seems to be not really necessary. But if you think this is a better place for the indent I can put it there as well, no problem.

  2. I'm not exactly sure what you mean. If you mean the case Standalone Indentation where the value of the variable content should not be indented, this is handled. I do this by stripping the IndentWriter in the ValueCode.

    Probably one could mark all newlines in a special way when creating the mustache AST. Then, during rendering, one could treat the newlines in a special way (i.e. by rendering them with indentation if needed). However, then either all the ___Code classes need to be adjusted to support indented rendering, or we can no longer use a standard writer but need a special-newline-aware writer thingy. To me this solution seemed much more complicated, especially with backwards compatibility in mind. This is why I decided against it (at least for now).

  3. The whitespace problems of the other tags stemmed from the fact, how standalone tags (i.e. tags that are followed by a newline and are only preceeded by whitespace in their line) are handled. According to the spec, if partials, delimiter change or sections occur as standalone tags they should be completely stripped, i.e. both the preceeding whitespace and the following newline are dropped during parsing. This is what I implemented.

@spullara
Copy link
Owner

I'm starting to warmup to just making this the default and removing the old whitespace behavior as it appears to be close enough in running all my current tests. I'd still like to avoid allocating at runtime - currently there are essentially no objects allocated while running a template - by optimizing the way indentation is done. Maybe through remembering where the '\n's are while we are doing the initial compile and inserting the indentation in the appendText call. I haven't looked at it closely enough to see if that is feasible. Also it appears that the limited recursion stuff isn't working right now but haven't tracked down why - probably because the indent writer replacing it.

@dpoetzsch
Copy link
Contributor Author

Alright, I will create an alternative implementation with \n handling at compile time.

Should I then remove whitespace backwards compatibility from the parser or only disable it by default?

What do you mean by "limited recursion stuff"?

@spullara
Copy link
Owner

I think we can remove it. The differences are pretty small and I don't think it will be disruptive except perhaps for tests which can be pretty easily updated.

Right now the compiler notices if your partials are recursive, i.e. a partial eventually includes itself, and when it sees that it uses a DepthLimitedWriter at runtime in order to ensure that we don't blow up the stack. With the IndentWriter that behavior was removed. If we move to compile time for that stuff we should be able to get away without a writer though it is tricky. If you would rather I work on it, I can.

@dpoetzsch
Copy link
Contributor Author

dpoetzsch commented Sep 24, 2018

I had the time to think about the allocation problem for a while now.

Actually passing the indent as char[] through all the execute methods down to appendText does not work. I have an almost working implementation but the final new line of partials break it (if you're interested in the implementation: https://github.com/dpoetzsch/mustache.java/tree/fix-all-whitespace-performant). This is because when rendering the final newline of a partial we do not know how much indentation has to follow. For example:

  - name: Standalone Indentation Nested
    desc: Each line of the partial should be indented before rendering.
    data: { content: "<\n-->" }
    template: |
      \
       {{> p1}}
      /
    partials:
      p1: |
        |
         {{> p2}}
        |
      p2: |
        :
        {{{content}}}
        :
    expected: |
      \
       |
        :
        <
      -->
        :
       |
      /

After the final newline of p2 we render a indentation of one space because p1 is indented with one space. After the final newline of p1 we must not render any indentation.

The indentation can only be rendered when we render the next text, so we must keep in mind that the output is at the start of a line and keep a stack of indentations with each partial.

This logic is basically what is already implemented with the IndentWriter.

For the whitespace-fixed version we currently need one allocation per partial call: The allocation of the IndentWriter.

The only other way I see is to have an implementation where we keep a stack of indents (e.g. an Array List) that we build up and take down during rendering. However this will make wrapping the the writer very complicated (e.g. for LatchedWriter) and breaks any chance for parallelization.

Another problem you mentioned was the reparsing of the output for new lines during rendering. I created #216 to avoid this problem. However I have to say I am not really happy with it because of the new complications described there.

Summing up, I'd be happy to see a better solution by you, because after trying out a lot of approaches I don't feel I can improve mine anymore :-(

@spullara
Copy link
Owner

Ran into the same problem when I tried to convert your patch. Thinking about it some more as it is really close.

@dpoetzsch
Copy link
Contributor Author

It's been almost a month now. Is there still any progress on this? If not it would be great to have at least the slow version of whitespace-correct mustache available so we can use it in our project (it does not affect the original implementation anyway).

If this is not desired please tell me, then I'll create a fork and use that instead.

@spullara
Copy link
Owner

Can you fork then for now? The issue is that if I merge this in I'll have to support this form of it going forward and I would rather have it natively support whitespace spec compliance. Just haven't had time to work on it and figure out how to handle the case we both ran into.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants