
How do I create a new Qanary component?

p-Heinze edited this page May 15, 2020 · 1 revision

A step-by-step procedure for integrating a new component into Qanary architecture.

To illustrate rapid engineering of QA systems using the Qanary architecture, we demonstrate the creation of a component that identifies a relation present in an input question. For instance, in the question "What is the real name of Batman", the component identifies foaf:name as a relation, i.e., a DBpedia property. The functionality is wrapped into a Qanary component, which is then integrated into a Qanary QA system following this step-by-step procedure:

Step 1

Check out the Qanary project from GitHub:

git clone https://github.com/WDAqua/Qanary

Step 2

In the project root folder (i.e., after checking out the GitHub project it would be the folder Qanary) build and install the core components of the Qanary framework.

mvn install

Note: To skip the Docker image creation you can use mvn -Ddockerfile.skip install.

  • To simplify the process of implementing new components, the project includes a Maven archetype qa.qanarycomponent-archetype in the sub-folder qanary_component-archetype for creating a new Qanary component on your local computer.

  • To create a new Qanary component, execute the following command in the project root folder:

mvn archetype:generate \
       -DarchetypeGroupId=eu.wdaqua.qanary.component \
       -DarchetypeArtifactId=qa.qanarycomponent-archetype \
       -DarchetypeVersion=1.1.0 \
       -DgroupId=eu.wdaqua.qanary.commons \
       -DartifactId=qa.qanary_component-MyComponent \
       -Dversion=0.0.1 \
       -Dpackage=eu.wdaqua.qanary.realnamerelationidentifier \
       -Dclassname=RealNameRelationIdentifier \
       -DinteractiveMode=false

Note: You might have to adjust the Maven archetype version used (cf. -DarchetypeVersion). See the current version statement of the Qanary component archetype.

  • The previous step creates the component named qanary_component-realnamerelationidentifier in the folder qa.qanary_component-MyComponent (the folder name follows the -DartifactId value).
  • Open the newly created Maven project in your IDE. The new component automatically appears in the list:

[Screenshot: project-structure]

Note: This screenshot shows the representation in IntelliJ. It will work similarly in any other IDE.

  • Learn more about the configuration of your component here.

Step 3

  • The component is already completely runnable.
  • Next you will want to implement the component body.
	/**
	 * implement this method encapsulating the functionality of your Qanary
	 * component, some helping notes w.r.t. the typical 3 steps of implementing a
	 * Qanary component are included in the method (you might remove all of them)
	 * 
	 * @throws SparqlQueryFailed
	 */
	@Override
	public QanaryMessage process(QanaryMessage myQanaryMessage) throws Exception {
		logger.info("process: {}", myQanaryMessage);

		QanaryUtils myQanaryUtils = this.getUtils(myQanaryMessage);
  • Retrieve the textual representation of the question:
		QanaryQuestion<String> myQanaryQuestion = new QanaryQuestion<String>(myQanaryMessage);
  • Use a simple string matcher to identify the sub-string “real name of” in the textual question (mapped to foaf:name) and identify the start (START) and end (END) position of this substring in the question.
  • Store this information within the triplestore using a SPARQL INSERT query. The text index of the matcher is used as well as the DBpedia property foaf:name.
		// push data to the Qanary triplestore
		String sparqlUpdateQuery = ""
				+ "prefix qa: <http://www.wdaqua.eu/qa#> "
				+ "prefix oa: <http://www.w3.org/ns/openannotation/core/> "
				+ "prefix xsd: <http://www.w3.org/2001/XMLSchema#> "
				+ "prefix dbp: <http://dbpedia.org/property/> "
				+ "prefix foaf: <http://xmlns.com/foaf/0.1/> "
				+ "INSERT { "
				+ "GRAPH <" + myQanaryUtils.getOutGraph() + "> { "
				+ "  ?a a qa:AnnotationOfRelation . "
				+ "  ?a oa:hasTarget [ "
				+ "           a    oa:SpecificResource; "
				+ "           oa:hasSource    <" + myQanaryQuestion.getUri() + ">; "
				+ "              oa:start \"" + START + "\"^^xsd:nonNegativeInteger ; " //
				+ "              oa:end  \"" + END + "\"^^xsd:nonNegativeInteger  " //
				+ "  ] ; "
				+ "     oa:hasBody foaf:name ;"
				+ "     oa:annotatedBy <http://relationidentifier.com> ; "
				+ "	    oa:AnnotatedAt ?time  "
				+ "}} "
				+ "WHERE { "
				+ "  BIND (IRI(str(RAND())) AS ?a) ."
				+ "  BIND (now() as ?time) "
				+ "}";
		logger.info("SPARQL query {}", sparqlUpdateQuery);
		myQanaryUtils.updateTripleStore(sparqlUpdateQuery, myQanaryMessage.getEndpoint());

		// hand the message back to the pipeline
		return myQanaryMessage;
	}
  • Build the component again after you have made changes to the implementation:

mvn package

  • Now, the Qanary component provides the demanded functionality.
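
The string-matching step described in Step 3 (finding "real name of" and its START and END positions) can be sketched as follows. This is only an illustration, not the code generated by the archetype: the class RelationSpan and the method findRealNameOf are hypothetical names, and the sketch assumes that END is the exclusive end index of the match, since the page does not say whether the end position is inclusive.

```java
import java.util.Optional;

/** Hypothetical helper: character span of the matched phrase in the question. */
final class RelationSpan {
    /** The phrase that this sketch maps to foaf:name. */
    static final String PHRASE = "real name of";

    final int start; // index of the first matched character (inclusive)
    final int end;   // index just past the last matched character (assumption: exclusive)

    RelationSpan(int start, int end) {
        this.start = start;
        this.end = end;
    }

    /**
     * Case-insensitive search for PHRASE in the question text.
     * Returns an empty Optional if the phrase does not occur.
     */
    static Optional<RelationSpan> findRealNameOf(String question) {
        int length = PHRASE.length();
        for (int i = 0; i + length <= question.length(); i++) {
            // regionMatches with ignoreCase=true compares without copying the string
            if (question.regionMatches(true, i, PHRASE, 0, length)) {
                return Optional.of(new RelationSpan(i, i + length));
            }
        }
        return Optional.empty();
    }
}
```

For the question "What is the real name of Batman" this yields start = 12 and end = 24, which could then be used as START and END in the SPARQL INSERT query. If no match is found, the component would simply skip the update.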

Step 4

Next, initialize a service composition in the form of a specific Qanary pipeline containing just the created component:

  1. Execute the reference pipeline implementation (java -jar Qanary/qanary_pipeline-template/target/qanary_qa.pipeline-X.Y.Z.jar).
  2. Create a package from your source code and start the new component, e.g., by executing: java -jar qanary_component-Realnamerelationidentifier-0.0.1.jar
  3. Open the Web frontend of the included Spring Boot AdminServer (managing the available Qanary components) provided by the Qanary pipeline at http://localhost:8080 (if you use the default configuration). Your component is now shown there:

[Screenshot: localhost8080_single-component]

  4. Go to http://localhost:8080/startquestionansweringwithtextquestion to start a sub-workflow of the whole process using only the created component.

[Screenshot: frontend_single-component]

  5. To ensure the annotation was stored correctly, query your triplestore for this pipeline. You might use the following SPARQL query to do so:
PREFIX qa: <http://www.wdaqua.eu/qa#>
PREFIX oa: <http://www.w3.org/ns/openannotation/core/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?a ?question ?start ?end ?time
FROM <$graphURL$>
WHERE {
   VALUES ?question {
      <$questionURL$>
   } .
   ?a a qa:AnnotationOfRelation .
   ?a oa:hasTarget [
        a    oa:SpecificResource ;
        oa:hasSource ?question ;
        oa:start ?start ;
        oa:end  ?end
    ] ;
    oa:hasBody foaf:name ;
    oa:annotatedBy <http://relationidentifier.com> ;
    oa:AnnotatedAt ?time .
}

Note: Replace $graphURL$ and $questionURL$ with the outgraph and the question URI shown in the result table of the question analysis.
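
If you want to run this check from code, the two placeholders can be filled in programmatically. The following is a minimal sketch under stated assumptions: the class VerificationQuery and its methods are hypothetical and not part of Qanary, the template is a condensed variant of a query of the shape shown above, and the URIs in the usage note are made up.

```java
import java.util.regex.Pattern;

/** Hypothetical helper that fills the $graphURL$ and $questionURL$ placeholders. */
final class VerificationQuery {
    /** Condensed query template using the prefixes from the INSERT query in Step 3. */
    static final String TEMPLATE = String.join("\n",
        "PREFIX qa: <http://www.wdaqua.eu/qa#>",
        "PREFIX oa: <http://www.w3.org/ns/openannotation/core/>",
        "PREFIX foaf: <http://xmlns.com/foaf/0.1/>",
        "SELECT ?a ?start ?end ?time",
        "FROM <$graphURL$>",
        "WHERE {",
        "  ?a a qa:AnnotationOfRelation ;",
        "     oa:hasTarget [",
        "        a oa:SpecificResource ;",
        "        oa:hasSource <$questionURL$> ;",
        "        oa:start ?start ;",
        "        oa:end ?end",
        "     ] ;",
        "     oa:hasBody foaf:name ;",
        "     oa:annotatedBy <http://relationidentifier.com> ;",
        "     oa:AnnotatedAt ?time .",
        "}");

    /** Characters that must not appear inside a SPARQL IRI reference. */
    private static final Pattern FORBIDDEN = Pattern.compile("[<>\"{}|^`\\\\\\s]");

    /** Returns the query with both placeholders replaced by the given URIs. */
    static String forGraphAndQuestion(String graphUri, String questionUri) {
        return TEMPLATE
            .replace("$graphURL$", requireIriSafe(graphUri))
            .replace("$questionURL$", requireIriSafe(questionUri));
    }

    /** Rejects values that would break out of the angle brackets in the query. */
    private static String requireIriSafe(String uri) {
        if (FORBIDDEN.matcher(uri).find()) {
            throw new IllegalArgumentException("not usable as an IRI: " + uri);
        }
        return uri;
    }
}
```

For example, VerificationQuery.forGraphAndQuestion("urn:graph:example", "urn:question:example") returns a query whose FROM clause reads FROM <urn:graph:example>. Validating the inputs before substitution is a deliberate choice here, because plain string replacement would otherwise allow a malformed URI to alter the query.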

Conclusion

Clearly, the created Qanary system isn't a complete Question Answering system. We just implemented one specific task and wrapped it into a Qanary pipeline. Now, you should think about creating further Qanary components (or reusing existing ones) that lead to a functional Question Answering system.

The workflow above can be iteratively expanded to a complete QA system, for example, by adding a SPARQL query builder and executor. Due to the characteristics of the approach, not only the mentioned question can be answered, but also at least 700 others (based on the DBpedia KB).

This concludes the integration of a newly developed component into Qanary following easy hassle-free integration steps.
