DBpedia-Entity is a standard test collection for entity search over the DBpedia knowledge base. It is meant for evaluating retrieval systems that return a ranked list of entities (DBpedia URIs) in response to a free text user query.
The first version of the collection (DBpedia-Entity v1) was released in 2013, based on DBpedia v3.7 [1]. It was created by assembling search queries from a number of entity-oriented benchmarking campaigns and mapping relevant results to DBpedia. An updated version of the collection, DBpedia-Entity v2, was released in 2017 as a result of a collaborative effort between the IAI group of the University of Stavanger, the Norwegian University of Science and Technology, Wayne State University, and Carnegie Mellon University [2]. It was published at the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'17), where it received a Best Short Paper Honorable Mention Award. See the paper and poster.
The test collection is based on DBpedia version 2015-10, specifically on the English subset.
We require entities to have both a title and an abstract (i.e., `rdfs:label` and `rdfs:comment` predicates); this effectively filters out category, redirect, and disambiguation pages. Note that list pages, on the other hand, are retained. In the end, there are 4.6 million entities, each uniquely identified by its URI. We use a simplified prefixed format: `http://dbpedia.org/resource/Albert_Einstein` => `<dbpedia:Albert_Einstein>`.
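The prefixed format described above can be produced with a small helper. The following is a minimal sketch, not part of the collection's tooling; the function names `to_prefixed` and `to_uri` are hypothetical, and it assumes every entity URI starts with the `http://dbpedia.org/resource/` namespace.

```python
# Namespace taken from the example URI in this README.
DBPEDIA_RESOURCE_NS = "http://dbpedia.org/resource/"


def to_prefixed(uri: str) -> str:
    """Convert a full DBpedia resource URI to the <dbpedia:...> form."""
    if not uri.startswith(DBPEDIA_RESOURCE_NS):
        raise ValueError(f"not a DBpedia resource URI: {uri}")
    return "<dbpedia:" + uri[len(DBPEDIA_RESOURCE_NS):] + ">"


def to_uri(prefixed: str) -> str:
    """Convert a <dbpedia:...> identifier back to a full URI."""
    if not (prefixed.startswith("<dbpedia:") and prefixed.endswith(">")):
        raise ValueError(f"not a prefixed DBpedia identifier: {prefixed}")
    return DBPEDIA_RESOURCE_NS + prefixed[len("<dbpedia:"):-1]


print(to_prefixed("http://dbpedia.org/resource/Albert_Einstein"))
```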
The collection consists of a set of heterogeneous entity-bearing queries, assembled from various benchmarking campaigns (see the paper for details). Queries are categorized into four groups:
Category | Description | Examples |
---|---|---|
SemSearch_ES | Named entity queries | "brooklyn bridge", "08 toyota tundra" |
INEX-LD | IR-style keyword queries | "electronic music genres" |
QALD2 | Natural language questions | "Who is the mayor of Berlin?" |
ListSearch | Queries that seek a particular list of entities | "Professional sports teams in Philadelphia" |
All queries are prefixed with the name of the originating benchmark. `SemSearch_ES`, `INEX-LD`, and `QALD2` each correspond to a separate category; the rest of the queries belong to the `ListSearch` category.
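The prefix rule above can be turned into a small lookup. This is a sketch under assumptions: it presumes the benchmark name appears at the start of the query ID, and it treats `-` and `_` as interchangeable because the exact spelling used in the ID strings is not stated here. The function name `query_category` and the sample IDs are hypothetical.

```python
# Benchmarks that map to their own category, as named in this README.
NAMED_CATEGORIES = ("SemSearch_ES", "INEX-LD", "QALD2")


def _normalize(text: str) -> str:
    """Lowercase and unify separators so 'INEX_LD' and 'INEX-LD' match."""
    return text.replace("_", "-").lower()


def query_category(query_id: str) -> str:
    """Return the query category implied by the benchmark prefix of an ID."""
    normalized_id = _normalize(query_id)
    for name in NAMED_CATEGORIES:
        if normalized_id.startswith(_normalize(name)):
            return name
    # Every other originating benchmark falls into ListSearch.
    return "ListSearch"


# Hypothetical IDs, for illustration only.
for qid in ("SemSearch_ES-1", "INEX_LD-1", "QALD2_te-1", "TREC_Entity-1"):
    print(qid, "->", query_category(qid))
```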
Relevance judgments were collected using crowdsourcing. To ensure high quality, we obtained further expert annotations for cases with substantial disagreement. In total, over 49K query-entity pairs were labeled on a three-point scale (0: irrelevant, 1: relevant, and 2: highly relevant).
The DBpedia-Entity v2 collection can be found under `collection/v2` and is organized as follows:
- `queries-v2.txt`: The set of 467 queries, where each line contains a query ID and query text pair.
- `queries-v2_stopped.txt`: The same queries, with stop patterns and punctuation marks removed.
- `qrels-v2.txt`: Relevance judgments in standard TREC format.
- `folds/`: Partitioning of queries for 5-fold cross validation. This is provided to make results directly comparable by using the same partitioning for supervised approaches. A separate file is provided for each query subset; if training is done over the set of all queries, use the `all_queries.json` file.
- `annotator_agreements.tsv`: Inter-annotator agreements between crowd workers (and expert annotators, if applicable) for each query-entity pair. The agreement scores are computed according to the Fleiss' kappa index (i.e., Eq (3) of its Wikipedia article). This information may be used as a proxy for query difficulty.
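As an illustration of how the query and qrels files might be read, here is a minimal sketch. It assumes the query ID and query text are separated by whitespace (the exact delimiter is not stated above) and that the qrels follow the usual four-column TREC layout `query_id iteration entity_id relevance`. The function names and the sample lines are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable


def parse_queries(lines: Iterable[str]) -> Dict[str, str]:
    """Map query ID -> query text, splitting each line on the first whitespace run."""
    queries = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split(None, 1)
        if len(parts) != 2:
            raise ValueError(f"malformed query line: {line!r}")
        queries[parts[0]] = parts[1]
    return queries


def parse_qrels(lines: Iterable[str]) -> Dict[str, Dict[str, int]]:
    """Map query ID -> {entity ID -> graded relevance (0, 1, or 2)}."""
    qrels: Dict[str, Dict[str, int]] = defaultdict(dict)
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 4:
            raise ValueError(f"malformed qrels line: {line!r}")
        query_id, _iteration, entity_id, relevance = fields
        qrels[query_id][entity_id] = int(relevance)
    return dict(qrels)


# Hypothetical sample lines, not taken from the collection.
sample_queries = ["Q-1\tbrooklyn bridge\n"]
sample_qrels = ["Q-1 0 <dbpedia:Brooklyn_Bridge> 2\n"]
print(parse_queries(sample_queries))
print(parse_qrels(sample_qrels))
```

To read the real files, pass an open file handle instead of a list, for example `parse_qrels(open("collection/v2/qrels-v2.txt", encoding="utf-8"))`.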
This repository also contains the DBpedia-Entity v1 collection, which was built on DBpedia version 3.7. The collection can be found under `collection/v1` and is organized similarly to the v2 version. There are, however, three qrels files for DBpedia-Entity v1:
- `qrels-v1_37.txt`: The original qrels, based on DBpedia 3.7.
- `qrels-v1_39.txt`: Qrels with updated entity IDs according to DBpedia 3.9.
- `qrels-v1_2015_10.txt`: Qrels with updated entity IDs according to DBpedia 2015-10.
The `runs` folder contains a set of baseline rankings ("runs") in TREC format:
- `/v1`: The runs related to DBpedia-Entity v1, reported in Table 2 of the paper [2].
- `/v2`: The runs related to DBpedia-Entity v2, reported in the table below. The evaluation metric is NDCG (Normalized Discounted Cumulative Gain) at ranks 10 and 100. New retrieval systems evaluated using DBpedia-Entity v2 should be compared against these results. Note that all these runs are generated using the `queries-v2_stopped.txt` query file.
Model | SemSearch ES @10 | SemSearch ES @100 | INEX-LD @10 | INEX-LD @100 | ListSearch @10 | ListSearch @100 | QALD-2 @10 | QALD-2 @100 | Total @10 | Total @100 |
---|---|---|---|---|---|---|---|---|---|---|
BM25 | 0.2497 | 0.4110 | 0.2770 | 0.3612 | 0.2199 | 0.3302 | 0.2751 | 0.3366 | 0.2558 | 0.3582 |
PRMS | 0.5340 | 0.6108 | 0.3590 | 0.4295 | 0.3684 | 0.4436 | 0.3151 | 0.4026 | 0.3905 | 0.4688 |
MLM-all | 0.5528 | 0.6247 | 0.3752 | 0.4493 | 0.3712 | 0.4577 | 0.3249 | 0.4208 | 0.4021 | 0.4852 |
LM | 0.5555 | 0.6475 | 0.3999 | 0.4745 | 0.3925 | 0.4723 | 0.3412 | 0.4338 | 0.4182 | 0.5036 |
SDM | 0.5535 | 0.6672 | 0.4030 | 0.4911 | 0.3961 | 0.4900 | 0.3390 | 0.4274 | 0.4185 | 0.5143 |
LM+ELR | 0.5554 | 0.6469 | 0.4040 | 0.4816 | 0.3992 | 0.4845 | 0.3491 | 0.4383 | 0.4230 | 0.5093 |
SDM+ELR | 0.5548 | 0.6680 | 0.4104 | 0.4988 | 0.4123 | 0.4992 | 0.3446 | 0.4363 | 0.4261 | 0.5211 |
MLM-CA | 0.6247 | 0.6854 | 0.4029 | 0.4796 | 0.4021 | 0.4786 | 0.3365 | 0.4301 | 0.4365 | 0.5143 |
BM25-CA | 0.5858 | 0.6883 | 0.4120 | 0.5050 | 0.4220 | 0.5142 | 0.3566 | 0.4426 | 0.4399 | 0.5329 |
FSDM | 0.6521 | 0.7220 | 0.4214 | 0.5043 | 0.4196 | 0.4952 | 0.3401 | 0.4358 | 0.4524 | 0.5342 |
BM25F-CA | 0.6281 | 0.7200 | 0.4394 | 0.5296 | 0.4252 | 0.5106 | 0.3689 | 0.4614 | 0.4605 | 0.5505 |
FSDM+ELR | 0.6563 | 0.7257 | 0.4354 | 0.5134 | 0.4220 | 0.4985 | 0.3468 | 0.4456 | 0.4590 | 0.5408 |
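For readers unfamiliar with the metric, the sketch below shows one common formulation of NDCG at a rank cutoff, using the graded label directly as the gain and a log2 rank discount. That the numbers in the table were computed with exactly this variant is an assumption; use the official evaluation tooling for comparable results. The function names and the toy data are hypothetical.

```python
import math
from typing import Dict, List


def dcg(gains: List[float]) -> float:
    """Discounted cumulative gain: gain at rank r is divided by log2(r + 1)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg_at_k(ranking: List[str], judgments: Dict[str, int], k: int) -> float:
    """NDCG@k for one query; unjudged entities count as irrelevant (gain 0)."""
    gains = [judgments.get(entity, 0) for entity in ranking[:k]]
    # Ideal ranking: all judged entities sorted by decreasing relevance.
    ideal_gains = sorted(judgments.values(), reverse=True)[:k]
    ideal = dcg(ideal_gains)
    return dcg(gains) / ideal if ideal > 0 else 0.0


# Toy example with made-up entity IDs and labels on the 0-2 scale.
toy_judgments = {"<dbpedia:A>": 2, "<dbpedia:B>": 1, "<dbpedia:C>": 0}
toy_ranking = ["<dbpedia:B>", "<dbpedia:A>", "<dbpedia:C>"]
print(round(ndcg_at_k(toy_ranking, toy_judgments, k=10), 4))
```

A collection-level score is typically the mean of the per-query values over all queries in the subset.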
- DBpedia-Entity-CAR: DBpedia-Entity v2 collection projected onto the Wikipedia dump used in TREC Complex Answer Retrieval v2.1, a contribution from TREMA Lab at University of New Hampshire.
- Entity summarization: 100 query-entity pairs (evenly selected from the four query subsets), their corresponding entity facts, and their graded judgments with respect to importance, relevance, and utility.
- Target type identification: DBpedia-Entity queries annotated with target query types using the DBpedia ontology.
If you are using this collection, please cite the following paper:
    @inproceedings{Hasibi:2017:DVT,
      author = {Hasibi, Faegheh and Nikolaev, Fedor and Xiong, Chenyan and Balog, Krisztian and Bratsberg, Svein Erik and Kotov, Alexander and Callan, Jamie},
      title = {DBpedia-Entity V2: A Test Collection for Entity Search},
      booktitle = {Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval},
      series = {SIGIR '17},
      year = {2017},
      pages = {1265--1268},
      doi = {10.1145/3077136.3080751},
      publisher = {ACM}
    }
If possible, please also include the http://tiny.cc/dbpedia-entity URL in your paper.
This research was partially supported by the Norwegian Research Council, National Science Foundation (NSF) grant IIS-1422676, Google Faculty Research Award, and Allen Institute for Artificial Intelligence Student Fellowship. We thank Saeid Balaneshin, Jan R. Benetka, Heng Ding, Dario Garigliotti, Mehedi Hasan, Indira Kurmantayeva, and Shuo Zhang for their help with creating relevance judgements.
In case of questions, feel free to contact [email protected] or [email protected].
[1] Krisztian Balog and Robert Neumayer. 2013. "A Test Collection for Entity Search in DBpedia". In Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '13). 737-740.
[2] Faegheh Hasibi, Fedor Nikolaev, Chenyan Xiong, Krisztian Balog, Svein Erik Bratsberg, Alexander Kotov, and Jamie Callan. 2017. "DBpedia-Entity v2: A Test Collection for Entity Search". In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '17). 1265-1268.