TermIt is a SKOS-compliant terminology management tool based on Semantic Web technologies. It allows managing vocabularies consisting of thesauri and ontologies. It can also manage documents which use terms from the vocabularies and analyze the documents to find occurrences of these terms.
An asset is an object of one of the main domain types managed by the system - Resource, Term or Vocabulary.
- JDK 8 or newer
- Apache Maven 3.6.x or newer
The system is split into two projects: TermIt is the backend and TermIt-UI is the frontend. Both projects are built separately and can run separately.
This section briefly lists the main technologies and principles used (or planned to be used) in the application.
- Spring Boot 2, Spring Framework 5, Spring Security, Spring Data (paging, filtering)
- Jackson 2.13
- JB4JSON-LD*
- JOPA
- JUnit 5* (RT used 4), Mockito 3* (RT used 1), Hamcrest 2* (RT used 1)
- Servlet API 4* (RT used 3.0.1)
- JSON Web Tokens* (CSRF protection not necessary for JWT)
- SLF4J + Logback
- CORS* (for separate frontend)
- Java bean validation (JSR 380)*
* Technology not used in INBAS RT
We had to switch from the standard Java DOM implementation to jsoup because DOM sometimes had trouble parsing HTML documents (meta tags in the header).
Jsoup, on the other hand, handles HTML natively and should be able to work with XML (if need be) as well.
The application uses the JSR 380 validation API. This provides a generic, easy-to-use API for bean validation based on annotations.
Use it to verify input data. See User and its validation in BaseRepositoryService/UserRepositoryService.
ValidationException is then handled by RestExceptionHandler and an appropriate response is returned to the client.
TermIt is preconfigured to run against a local GraphDB repository at http://localhost:7200/repositories/termit.
This can be changed by updating application.yml.
User is a domain class used for domain functions, mostly for resource provenance (author, last editor). It does not support passwords.
UserAccount is used for security-related functions and supports passwords. Most parts of the application should use User.
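The split between the two classes can be illustrated with a minimal sketch. The field names and the `toUser` conversion below are hypothetical and chosen only for illustration; the actual TermIt classes may look different.

```java
import java.util.Objects;

class UserModelSketch {

    /** Domain-level user: identity data only, no credentials (hypothetical fields). */
    static class User {
        final String username;
        final String fullName;

        User(String username, String fullName) {
            this.username = Objects.requireNonNull(username);
            this.fullName = Objects.requireNonNull(fullName);
        }
    }

    /** Security-level account: the same identity data plus a password hash (hypothetical fields). */
    static class UserAccount {
        final String username;
        final String fullName;
        final String passwordHash;

        UserAccount(String username, String fullName, String passwordHash) {
            this.username = Objects.requireNonNull(username);
            this.fullName = Objects.requireNonNull(fullName);
            this.passwordHash = Objects.requireNonNull(passwordHash);
        }

        /** Hypothetical conversion that drops the credential before the object leaves the security layer. */
        User toUser() {
            return new User(username, fullName);
        }
    }

    public static void main(String[] args) {
        UserAccount account = new UserAccount("jdoe", "Jane Doe", "<password-hash>");
        // Provenance data (author, last editor) would reference the password-free User.
        User author = account.toUser();
        System.out.println(author.username + " / " + author.fullName);
    }
}
```

Keeping the password out of the domain class means that code recording provenance never has a credential in hand.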
A JMX bean called AppAdminBean was added to the application. Currently, it supports invalidation of application caches.
The bean's name is set during the Maven build. If multiple deployments of TermIt are running on the same application server, each deployment needs a different bean name. A Maven property, deployment, with default value DEV was introduced for this purpose. To specify a different value, pass a command line parameter to Maven, so the build call might look as follows:
mvn clean package -B -P production "-Ddeployment=DEV"
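Why the name has to differ per deployment can be seen in a small, standalone JMX sketch. It is not TermIt's AppAdminBean: the `CacheAdminMBean` interface, the `termit.sketch` domain and the `deployment` key in the object name are assumptions made for illustration, and only standard `javax.management` calls are used.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicInteger;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class AppAdminJmxSketch {

    /** Hypothetical management interface (Standard MBean convention: <Class>MBean, must be public). */
    public interface CacheAdminMBean {
        void invalidateCaches();
        int getInvalidationCount();
    }

    /** Hypothetical bean; it only counts invalidation requests instead of clearing real caches. */
    public static class CacheAdmin implements CacheAdminMBean {
        private final AtomicInteger invalidations = new AtomicInteger();

        @Override
        public void invalidateCaches() {
            invalidations.incrementAndGet();
        }

        @Override
        public int getInvalidationCount() {
            return invalidations.get();
        }
    }

    /** Builds an object name that embeds the deployment name (domain and keys are assumptions). */
    static ObjectName beanName(String deployment) throws JMException {
        return new ObjectName("termit.sketch:type=AppAdminBean,deployment=" + ObjectName.quote(deployment));
    }

    /** Registers a bean under a deployment-specific name on the platform MBean server. */
    static ObjectName register(String deployment) throws JMException {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = beanName(deployment);
        // Registering a second bean under an already used name throws InstanceAlreadyExistsException.
        server.registerMBean(new CacheAdmin(), name);
        return name;
    }

    public static void main(String[] args) throws JMException {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName dev = register("DEV");
        ObjectName prod = register("PROD");
        // Invalidate caches of one deployment only; the other bean is unaffected.
        server.invoke(dev, "invalidateCaches", new Object[0], new String[0]);
        System.out.println(dev + " -> " + server.getAttribute(dev, "InvalidationCount"));
        System.out.println(prod + " -> " + server.getAttribute(prod, "InvalidationCount"));
    }
}
```

Two deployments on one server share the platform MBean server, so a fixed name would make the second registration fail; a per-deployment name avoids the clash.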
Fulltext search currently supports multiple types of implementation:
- Simple substring matching on term and vocabulary label (default)
- RDF4J with Lucene SAIL
- GraphDB with Lucene connector
Each implementation has its own search query which is loaded and used by SearchDao. In order for the more advanced implementations for Lucene to work, a corresponding Maven profile (graphdb, rdf4j) has to be selected. This inserts the correct query into the resulting artifact during build. If none of the profiles is selected, the default search is used.
Note that in case of GraphDB, corresponding Lucene connectors (label_index for labels and defcom_index for definitions and comments) have to be created as well.
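The default behaviour, simple substring matching on labels, can be approximated in plain Java as below. This is only an illustration of the matching semantics: in TermIt the search is a repository query loaded by SearchDao, and the case-insensitive comparison is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Locale;

class SubstringSearchSketch {

    /** Returns the labels that contain the search string, ignoring case (assumed behaviour). */
    static List<String> search(Collection<String> labels, String searchString) {
        String needle = searchString.toLowerCase(Locale.ROOT);
        List<String> matches = new ArrayList<>();
        for (String label : labels) {
            if (label.toLowerCase(Locale.ROOT).contains(needle)) {
                matches.add(label);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Hypothetical term and vocabulary labels.
        List<String> labels = Arrays.asList("Building", "Residential building", "Road");
        System.out.println(search(labels, "BUILD"));
    }
}
```

Unlike the Lucene-based variants, plain substring matching does no stemming or ranking, which is why the advanced profiles exist.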
The test in-memory repository is configured to be a SPIN SAIL with RDFS inferencing engine. Thus, basically all the inference features available in production are available in tests as well. However, the repository is by default left empty (without the model or SPIN rules) to facilitate test performance (inference in RDF4J is really slow). To load the TermIt model into the repository and thus enable RDFS inference, call the enableRdfsInference method available on both BaseDaoTestRunner and BaseServiceTestRunner. SPIN rules are currently not loaded as they don't seem to be used by any tests.
The ontology on which TermIt is based can be found in the ontology folder. For proper inference functionality, termit-model.ttl, the popis-dat ontology model (http://onto.fel.cvut.cz/ontologies/slovnik/agendovy/popis-dat/model) and the SKOS vocabulary model (http://www.w3.org/TR/skos-reference/skos.rdf) need to be loaded into the repository used by TermIt.
We are using JavaMelody for monitoring the application and its usage. The data are available on the /monitoring endpoint and are secured using basic authentication. Credentials are configured using the javamelody.init-parameters.authorized-users parameter in application.yml (see the JavaMelody Spring Boot Starter docs).
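As a rough illustration of how a client might reach the basic-auth protected endpoint, the sketch below builds the Authorization header and queries /monitoring using only the JDK. The base URL and the credentials are placeholders, not values from TermIt; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class MonitoringClientSketch {

    /** Builds an HTTP Basic Authorization header value from a username and password. */
    static String basicAuthHeader(String user, String password) {
        String credentials = user + ":" + password;
        return "Basic " + Base64.getEncoder().encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }

    /** Requests <baseUrl>/monitoring with basic authentication and returns the HTTP status code. */
    static int fetchStatus(String baseUrl, String user, String password) throws IOException {
        HttpURLConnection connection =
                (HttpURLConnection) URI.create(baseUrl + "/monitoring").toURL().openConnection();
        connection.setRequestProperty("Authorization", basicAuthHeader(user, password));
        try {
            return connection.getResponseCode();
        } finally {
            connection.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 3) {
            // Usage: java MonitoringClientSketch.java <base-url> <user> <password>
            System.out.println(fetchStatus(args[0], args[1], args[2]));
        } else {
            // Without arguments, only show the header for placeholder credentials.
            System.out.println(basicAuthHeader("user", "password"));
        }
    }
}
```

A 401 status would indicate that the supplied credentials do not match those configured in application.yml.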
TermIt REST API is documented on SwaggerHub under the appropriate version.
Build configuration and deployment is described in setup.md.
The Docker image of the TermIt backend can be built with
docker build -t termit-server .
Then, TermIt can be run and exposed on port 8080 with
sudo docker run -e REPOSITORY_URL=<GRAPHDB_REPOSITORY_URL> -p 8080:8080 termit-server
The optional <GRAPHDB_REPOSITORY_URL> argument points to the RDF4J/GraphDB repository.
Licensed under GPL v3.0.