Extra spaces popping up - caused by line breaks and XML indent #295

Open
ilovan opened this issue Jan 4, 2021 · 1 comment

ilovan commented Jan 4, 2021

See this example: https://cwrc.ca/islandora/object/orlando%3Ajohnp2

The same document transformed with the XSLT output property indent="no": https://cwrc.ca/islandora/object/cwrc%3Aa8541aa9-8aaa-43d2-a4fd-c81ee724ff80

This seems to fix most of the extra spaces popping up.

lucaju commented Jan 4, 2021

It is related to whitespace, but not necessarily to “indentation”. The problem seems to be the line breaks. Depending on where a line is broken (to format the XML, which also adds indentation), a new node (of type text) is appended to the element.
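To make the text-node effect concrete, here is a minimal sketch using Python's standard `xml.dom.minidom`. It is only an illustration, not how CWRC-Writer parses documents; the helper name `child_summary` and the two shortened fragments are made up for this example, with tag names borrowed from the sample in this thread.

```python
from xml.dom import minidom

# Two serializations of the same fragment (shortened, hypothetical sample).
inline = '<PLACE><REGION>Ontario</REGION><GEOG REG="Canada"/></PLACE>'
broken = '<PLACE><REGION>Ontario</REGION>\n<GEOG REG="Canada"/>\n</PLACE>'


def child_summary(xml_text):
    """Return a (kind, content) pair for each child of the root element."""
    root = minidom.parseString(xml_text).documentElement
    summary = []
    for node in root.childNodes:
        if node.nodeType == node.TEXT_NODE:
            summary.append(("text", node.data))
        else:
            summary.append(("element", node.nodeName))
    return summary


print(child_summary(inline))  # only the two element children
print(child_summary(broken))  # each line break shows up as an extra text node
```

The second fragment has no indentation at all, yet it still gains two text children, one per line break.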

Removing just indentation doesn't solve the problem.

<CHRONPROSE>
                        <NAME STANDARD="Johnson, Pauline" REF="https://commons.cwrc.ca/orlando:7a7be083-1459-4879-aee3-7b6fecbed81f">PJ</NAME> was born at the family home, <PLACE>
                           <PLACENAME>Chiefswood</PLACENAME>, on the <REGION REG="Ontario">Six Nations Reserve</REGION>
                           <GEOG REG="Canada"/>
                        </PLACE> near <PLACE>
                           <SETTLEMENT>Brantford</SETTLEMENT>, <REGION>Ontario</REGION>
<GEOG REG="Canada"/>
</PLACE>.</CHRONPROSE> 

The problem is solved when the line breaks are removed along with the indentation.

<CHRONSTRUCT RELEVANCE="SELECTIVE" CHRONCOLUMN="WRITINGCLIMATE" RESP="JSC">
                     <DATE VALUE="1861-03-10">10 March 1861</DATE>
                     <CHRONPROSE>
                        <NAME STANDARD="Johnson, Pauline" REF="https://commons.cwrc.ca/orlando:7a7be083-1459-4879-aee3-7b6fecbed81f">PJ</NAME> was born at the family home, <PLACE>
                           <PLACENAME>Chiefswood</PLACENAME>, on the <REGION REG="Ontario">Six Nations Reserve</REGION>
                           <GEOG REG="Canada"/>
                        </PLACE> near <PLACE>
                           <SETTLEMENT>Brantford</SETTLEMENT>, <REGION>Ontario</REGION><GEOG REG="Canada"/></PLACE>.
                    </CHRONPROSE>

Note that the “.” is now on the same line as <GEOG REG="Canada"/> and </PLACE>.

Since we don’t know where the line breaks are, the only way to do this for sure is to remove all the (empty?) whitespace, that is, to have the XML as a single string. Not sure if this is a good idea, though.
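One possible, more conservative variant of that idea is sketched below in Python with the standard `xml.dom.minidom`. It is an assumption for discussion, not a tested fix for this issue: the function name `drop_linebreak_whitespace` and the heuristic itself (drop only whitespace-only text nodes that contain a line break) are hypothetical, and the sample is a shortened version of the fragment above.

```python
from xml.dom import minidom


def drop_linebreak_whitespace(node):
    """Remove whitespace-only text nodes that contain a line break.

    Hypothetical heuristic: text with real content, and whitespace
    without a line break, is left untouched.
    """
    for child in list(node.childNodes):  # copy, since we remove while iterating
        if child.nodeType == child.TEXT_NODE:
            if child.data.strip() == "" and "\n" in child.data:
                node.removeChild(child)
        else:
            drop_linebreak_whitespace(child)


sample = (
    '<CHRONPROSE>near <PLACE>\n'
    '   <SETTLEMENT>Brantford</SETTLEMENT>, <REGION>Ontario</REGION>\n'
    '   <GEOG REG="Canada"/>\n'
    '</PLACE>.</CHRONPROSE>'
)

doc = minidom.parseString(sample)
drop_linebreak_whitespace(doc.documentElement)
print(doc.documentElement.toxml())
```

In this sample the spaces inside "near " and ", " survive because those text nodes have real content. The caveat is that a line break sitting between two inline elements may be the only separator between two words, so even this narrower rule could glue text together in some documents.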

lucaju self-assigned this Jan 4, 2021