-
Notifications
You must be signed in to change notification settings - Fork 0
/
README.txt
51 lines (36 loc) · 2.22 KB
/
README.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
See ~/Documents/tapas/profiling/README.txt for some comments on the
null-terminated record handling.
Since I don't currently use `xmlsh`, or an ant task, or eXist, or an
oXygen mutli-file scenario, or some other mechanism of running `saxon`
multiple times with one load of the Java VM, it is orders of magnitude
faster to run this on a single large directory of input files rather
than on a series of individual files. So, to test it on *all* the test
input I have
1) Create a single flat directory with the XML files
2) Run `saxon` on the entire dir.
Note that the output directory goes in this directory so that relative
pointers to CSS and JS work. (No need to check-in that output dir,
though.)
--------- (1) ---------
$ cd ~/Documents/tapas-generic/
$ rm -fr /tmp/IN/; mkdir /tmp/IN/
$ rm -fr OUT/*
Then, to generate a single directory with all the in vivo XML files w/o
hierarchy:
$ time find ~/Documents/tapas/profiling/data -name '*.xml' -print0 | egrep -Zz '/.*/' | egrep -Zzv 'tei-xsl' | while read -d $'\0' f; do g=`echo "$f" | perl -pe 's, ,_,g;'`; cp "$f" /tmp/IN/`basename $g` ; done
Note that this currently works because the one and only duplicate name
(alexander.xml) is also a duplicate file. :-)
Or, to generate a single directory with all the in vitro XML files:
$ cp -pr ~/Documents/tapas/rendering/data/test0*.xml /tmp/IN/
$ cp -pr ~/Documents/tapas/rendering/data/test06_page_images /tmp/IN/
--------- (2) ---------
$ cd ~/Documents/tapas-generic/
$ time saxon -xsl:tei2html.xslt -s:/tmp/IN/ -o:OUT/ fullHTML=true
$ # either change names in Emacs, or use:
$ for f in OUT/*.xml ; do mv $f `dirname $f`/`basename $f .xml`.xhtml ; done
---------
For previous XSLT 1.0 system (i.e., pre 2015-10-12), use the following
to generate complete HTML output of all profiling and data.
$ cd ~/Documents/tapas-xq/resources/tapas-generic
$ time find ~/Documents/tapas/profiling/data -name '*.xml' -print0 | egrep -Zz '/.*/' | egrep -Zzv 'tei-xsl' | while read -d $'\0' f; do echo "---------$f:" ; OUT=OUT/`basename "${f}" .xml`.html; time ( xsltproc tei2html_1.xsl "${f}" > /tmp/TMP.xml ; xsltproc --stringparam fullHTML true tei2html_2.xsl /tmp/TMP.xml > "$OUT" ) ; done
Takes about 7.8 mins from within emacs on my home machine.