How do I create a new Qanary component?
To illustrate the rapid engineering of QA systems using the Qanary architecture, we demonstrate the creation of a component that identifies a relation present in an input question. For instance, in the question “What is the real name of Batman?”, the component identifies foaf:name as the relation, i.e., a DBpedia property. To integrate this functionality into the Qanary ecosystem, wrap it in a Qanary component and create a Qanary QA system by following this step-by-step procedure:
Check out the Qanary project from GitHub:
git clone https://github.com/WDAqua/Qanary
In the project root folder (i.e., the folder Qanary after checking out the GitHub project), build and install the core components of the Qanary framework:
mvn install -Dgpg.skip
Note: To skip the Docker image creation, you can use mvn -Ddockerfile.skip -Dgpg.skip install.
- To simplify the implementation of new components, the project includes a Maven archetype, qa.qanarycomponent-archetype, in the sub-folder qanary_component-archetype for creating a new Qanary component on your local computer.
- To create a new Qanary component, execute the following command in the project root folder:
mvn archetype:generate \
-DarchetypeGroupId=eu.wdaqua.qanary.component \
-DarchetypeArtifactId=qa.qanarycomponent-archetype \
-DarchetypeVersion=1.1.0 \
-DgroupId=eu.wdaqua.qanary.commons \
-DartifactId=qa.qanary_component-MyComponent \
-Dversion=0.0.1 \
-Dpackage=eu.wdaqua.qanary.realnamerelationidentifier \
-Dclassname=RealNameRelationIdentifier \
-DinteractiveMode=false
Note: You might have to adjust the Maven archetype version used (cf. -DarchetypeVersion). See the current version statement of the Qanary component archetype.
- The previous step creates the component named qanary_component-realnamerelationidentifier in the folder qa.qanary_component-MyComponent.
- Open the newly created Maven project in your IDE. The new component automatically appears in the list:
Note: This screenshot shows the representation in IntelliJ. It will work similarly in any other IDE.
- Learn more about the configuration of your component here.
- The component is already completely runnable.
- Next you will want to implement the component body.
/**
 * implement this method encapsulating the functionality of your Qanary
 * component, some helping notes w.r.t. the typical 3 steps of implementing a
 * Qanary component are included in the method (you might remove all of them)
 *
 * @throws SparqlQueryFailed
 */
@Override
public QanaryMessage process(QanaryMessage myQanaryMessage) throws Exception {
    logger.info("process: {}", myQanaryMessage);
    QanaryUtils myQanaryUtils = this.getUtils(myQanaryMessage);
- Retrieve the textual representation of the question:
    QanaryQuestion<String> myQanaryQuestion = new QanaryQuestion<String>(myQanaryMessage);
- Use a simple string matcher to identify the substring “real name of” in the textual question (mapped to foaf:name) and determine the start (START) and end (END) positions of this substring in the question.
- Store this information in the triplestore using a SPARQL INSERT query. The query uses the text positions found by the matcher as well as the DBpedia property foaf:name.
    // push data to the Qanary triplestore
    // (START and END are the character positions found by the string matcher)
    String sparqlUpdateQuery = ""
            + "prefix qa: <http://www.wdaqua.eu/qa#> "
            + "prefix oa: <http://www.w3.org/ns/openannotation/core/> "
            + "prefix xsd: <http://www.w3.org/2001/XMLSchema#> "
            + "prefix dbp: <http://dbpedia.org/property/> "
            + "prefix foaf: <http://xmlns.com/foaf/0.1/> "
            + "INSERT { "
            + "GRAPH <" + myQanaryUtils.getOutGraph() + "> { "
            + " ?a a qa:AnnotationOfRelation . "
            + " ?a oa:hasTarget [ "
            + " a oa:SpecificResource; "
            + " oa:hasSource <" + myQanaryQuestion.getUri() + ">; "
            + " oa:start \"" + START + "\"^^xsd:nonNegativeInteger ; " //
            + " oa:end \"" + END + "\"^^xsd:nonNegativeInteger " //
            + " ] ; "
            + " oa:hasBody foaf:name ; "
            + " oa:annotatedBy <http://relationidentifier.com> ; "
            + " oa:AnnotatedAt ?time "
            + "}} "
            + "WHERE { "
            + " BIND (IRI(str(RAND())) AS ?a) . "
            + " BIND (now() as ?time) "
            + "}";
    logger.info("SPARQL query {}", sparqlUpdateQuery);
    myQanaryUtils.updateTripleStore(sparqlUpdateQuery, myQanaryMessage.getEndpoint());

    // hand the (unchanged) message back to the pipeline
    return myQanaryMessage;
}
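The string-matching step described above (finding “real name of” and deriving START and END) is not spelled out in the listing. It could look roughly like the following minimal sketch. The class name RealNameMatcher and the method findSpan are hypothetical and not part of the Qanary API. The sketch also assumes that END is the exclusive end offset of the match, which the text does not specify.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Qanary): locates the phrase "real name of".
final class RealNameMatcher {

    // Literal, case-insensitive match of the phrase that is mapped to foaf:name.
    private static final Pattern REAL_NAME =
            Pattern.compile(Pattern.quote("real name of"), Pattern.CASE_INSENSITIVE);

    /**
     * Returns {START, END} as character offsets into the question,
     * with END exclusive (an assumption), or empty if the phrase is absent.
     */
    static Optional<int[]> findSpan(String question) {
        Matcher m = REAL_NAME.matcher(question);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new int[] {m.start(), m.end()});
    }

    public static void main(String[] args) {
        findSpan("What is the real name of Batman?")
                .ifPresent(span ->
                        System.out.println("START=" + span[0] + ", END=" + span[1]));
        // prints "START=12, END=24"
    }
}
```

Inside process, the two offsets would be assigned to START and END before the SPARQL INSERT query is assembled. If the phrase is not found, the component can skip the update and return the message unchanged.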
- Build the component again after you have made changes to the implementation:
mvn package
- Now, the Qanary component provides the required functionality.
Next, initialize a service composition in the form of a specific Qanary pipeline containing just the created component:
- Execute the reference pipeline implementation (java -jar Qanary/qanary_pipeline-template/target/qanary_qa.pipeline-X.Y.Z.jar).
- Create a package from your source code and start the new component, e.g., by executing:
java -jar qanary_component-Realnamerelationidentifier-0.0.1.jar
- Open the web frontend of the included Spring Boot Admin server (which manages the available Qanary components), provided by the Qanary pipeline at http://localhost:8080 (if you use the default configuration). Your component is now listed there.
- Go to http://localhost:8080/startquestionansweringwithtextquestion to start a sub-workflow of the whole process using only the created component.
- To ensure the annotation was stored correctly, query your triplestore for this pipeline. You might use the following SPARQL query to do so:
PREFIX qa: <http://www.wdaqua.eu/qa#>
PREFIX oa: <http://www.w3.org/ns/openannotation/core/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?a ?question ?start ?end ?time
FROM <$graphURL$>
WHERE {
    VALUES ?question {
        <$questionURL$>
    } .
    ?a a qa:AnnotationOfRelation .
    ?a oa:hasTarget [
            a oa:SpecificResource ;
            oa:hasSource ?question ;
            oa:start ?start ;
            oa:end ?end
        ] ;
        oa:hasBody foaf:name ;
        oa:annotatedBy <http://relationidentifier.com> ;
        oa:AnnotatedAt ?time .
}
Note: Use the question analysis outgraph and question URI from the result table.
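The verification query contains the placeholders $graphURL$ and $questionURL$, which have to be replaced with the values from the result table before the query is sent to the triplestore. A minimal sketch of that substitution is shown below. The class QueryTemplate, its method fill, the shortened query template, and the example URIs are all hypothetical and only serve as illustration.

```java
import java.util.Map;

// Hypothetical helper (not part of Qanary): fills $name$ placeholders in a query.
final class QueryTemplate {

    /** Replaces every $key$ in the template with the corresponding value. */
    static String fill(String template, Map<String, String> values) {
        String result = template;
        for (Map.Entry<String, String> entry : values.entrySet()) {
            // String.replace performs a literal (non-regex) replacement
            result = result.replace("$" + entry.getKey() + "$", entry.getValue());
        }
        // Fail early if a placeholder was left unfilled.
        if (result.matches("(?s).*\\$\\w+\\$.*")) {
            throw new IllegalArgumentException("unfilled placeholder in query");
        }
        return result;
    }

    public static void main(String[] args) {
        // Shortened template for illustration; use the full query in practice.
        String template = "SELECT ?a FROM <$graphURL$> "
                + "WHERE { VALUES ?question { <$questionURL$> } }";
        // Example values are made up; take the real ones from the result table.
        System.out.println(fill(template, Map.of(
                "graphURL", "urn:graph:example",
                "questionURL", "urn:question:example")));
    }
}
```

Checking for leftover placeholders turns a forgotten value into an immediate error instead of a query that silently returns no rows.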
When creating a new component, you can set or edit several pieces of meta information. This information is mainly located in the component's pom.xml file (e.g., <developer></developer>) or in the application.properties file (e.g., spring.application.description). In the first case, some properties such as the developer, license, URL, description, and name are inherited from the parent artifact. As these are probably not correct for your component, it is recommended to specify them explicitly. You can look up the pre-defined values here.
Clearly, the created Qanary system isn't a complete Question Answering system. We just implemented one specific task and wrapped it into a Qanary pipeline. Now, you should think about creating further Qanary components (or reusing existing ones) that lead to a functional Question Answering system.
The workflow above can be iteratively expanded into a complete QA system, for example, by adding a SPARQL query builder and executor. Due to the characteristics of the approach, not only the mentioned question can be answered, but also at least 700 others (based on the DBpedia KB).
This concludes the integration of a newly developed component into Qanary following easy hassle-free integration steps.