rewrite-louis

An experimental parser for liblouis tables based on instaparse. Given a liblouis table, the parser returns plain Clojure data, which can be used, for example, to rewrite the table into another format.

Usage

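(require '[clojure.java.io :as io]
         '[instaparse.core :as insta])
(def louis-parser (insta/parser (io/resource "louis.bnf")))
;; slurp does not expand "~", so resolve the home directory explicitly
(louis-parser (slurp (io/file (System/getProperty "user.home")
                              "src/liblouis/tables/ar-ar-comp8.utb")))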

which returns a parse tree of nested vectors, for example (,,, marks elided nodes)

[:table
 [:comment "afr#1#Afrikaans Uncontracted#za#Afrikaans onverkort"]
 ,,,
 [:comment " <http://www.gnu.org/licenses/>."]
 [:include "en-ueb-g1.ctb"]]
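Because the result is ordinary nested vectors, it can be processed with core sequence functions. The following is a minimal sketch, not part of rewrite-louis: the helper names nodes-tagged and included-tables are hypothetical, and it assumes that rules appear as direct children of the :table node, as in the sample output above.

```clojure
;; Sample tree shaped like the parser output shown above.
(def sample-tree
  [:table
   [:comment "afr#1#Afrikaans Uncontracted#za#Afrikaans onverkort"]
   [:comment " <http://www.gnu.org/licenses/>."]
   [:include "en-ueb-g1.ctb"]])

(defn nodes-tagged
  "Returns the direct child nodes of tree whose first element is tag.
  Hypothetical helper; assumes rules are direct children of the root."
  [tree tag]
  (filter #(and (vector? %) (= tag (first %))) (rest tree)))

(defn included-tables
  "Returns the file names referenced by :include nodes."
  [tree]
  (map second (nodes-tagged tree :include)))

(included-tables sample-tree)
;; => ("en-ueb-g1.ctb")
```

The same pattern (filter by tag, then map over the remaining elements) could serve as a starting point for emitting each rule in another format.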

Command line

To run the parser over every table in a liblouis checkout, skipping dictionaries and build files:

find ~/src/liblouis/tables -type f -print | grep -v -e '\.dic' -e 'Makefile' -e 'maketablelist.sh' -e 'README' | sort | xargs lein run

Status

The parser currently handles roughly 95% of the tables in the liblouis distribution. It does not yet support

  • continuation lines (da-dk-g16-lit.ctb, da-dk-g26-lit.ctb, da-dk-g26l-lit.ctb)
  • very large tables, which cause an OutOfMemoryError (GC overhead limit exceeded) (zh-chn.ctb, zhcn-g1.ctb, zhcn-g2.ctb, ko-chars.cti, zh-tw.ctb, etc.)
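One possible workaround for the continuation-line limitation is to join continued lines before handing the text to the parser. The sketch below is not part of rewrite-louis; the function name join-continuation-lines is hypothetical, and it assumes that a continuation is marked by a backslash immediately before the line break. Whether this matches liblouis in every detail (for example, leading whitespace on the continued line) has not been verified here.

```clojure
(require '[clojure.string :as str])

(defn join-continuation-lines
  "Removes each backslash that is immediately followed by a line break,
  together with that line break, so a continued rule becomes one line.
  Assumption: this is how the table marks a continuation line."
  [text]
  (str/replace text #"\\\r?\n" ""))

(join-continuation-lines "always foo \\\n123-456\nalways bar 1\n")
;; => "always foo 123-456\nalways bar 1\n"
```

The result could then be passed to the parser in place of the raw slurp output.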

Acknowledgements

Much of the EBNF grammar was adapted from louis-parser and its liblouis table grammar, which is defined as a parsing expression grammar (PEG).

License

Copyright © 2021 Swiss Library for the Blind, Visually Impaired and Print Disabled

This program and the accompanying materials are made available under the terms of the Eclipse Public License 2.0 which is available at http://www.eclipse.org/legal/epl-2.0.

This Source Code may also be made available under the following Secondary Licenses when the conditions for such availability set forth in the Eclipse Public License, v. 2.0 are satisfied: GNU General Public License as published by the Free Software Foundation, either version 2 of the License, or (at your option) any later version, with the GNU Classpath Exception which is available at https://www.gnu.org/software/classpath/license.html.