TAT-DQA is a large-scale Document VQA dataset constructed by extending TAT-QA. It aims to stimulate progress in QA research over more complex and realistic visually-rich documents with rich tabular and textual content, especially those requiring numerical reasoning.
You can download our TAT-DQA dataset via TAT-DQA Dataset.
For more information, please refer to our TAT-DQA Website or read our ACM MM 2022 paper PDF.
Please kindly cite our work if you use our dataset or code. Thank you.
@inproceedings{zhu2022towards,
title={Towards complex document understanding by discrete reasoning},
author={Zhu, Fengbin and Lei, Wenqiang and Feng, Fuli and Wang, Chao and Zhang, Haozhou and Chua, Tat-Seng},
booktitle={Proceedings of the 30th ACM International Conference on Multimedia},
pages={4857--4866},
year={2022}
}
@inproceedings{zhu2024doc2soargraph,
title = "{D}oc2{S}oar{G}raph: Discrete Reasoning over Visually-Rich Table-Text Documents via Semantic-Oriented Hierarchical Graphs",
author = "Zhu, Fengbin and
Wang, Chao and
Feng, Fuli and
Ren, Zifeng and
Li, Moxin and
Chua, Tat-Seng",
editor = "Calzolari, Nicoletta and
Kan, Min-Yen and
Hoste, Veronique and
Lenci, Alessandro and
Sakti, Sakriani and
Xue, Nianwen",
booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",
year = "2024",
address = "Torino, Italia",
publisher = "ELRA and ICCL",
url = "https://aclanthology.org/2024.lrec-main.456",
pages = "5119--5131"
}
The TAT-DQA dataset is released under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.
For any issues, please create an issue here or email Fengbin Zhu at [email protected]. Thank you.