TAT3

This repository contains the source code for my master's thesis, "Large Language Model-Driven Data Enrichment".

Abstract

This thesis introduces the Table Augmentation via Text-to-Text Transfer (TAT3) framework, which leverages pretrained foundation LLMs to perform subject suggestion: given an incomplete table whose rows represent entities, TAT3 suggests the missing entities. The system is inspired by recent advances in AI assistants, specifically GitHub Copilot. TAT3 addresses a limitation of previous work on subject suggestion, which makes strict assumptions about the source of knowledge and about the input and output format. On a standard subject suggestion benchmark, TAT3 achieves results similar to established baselines. The first suggestion is correct for 73.0% of the queries, which is comparable to the best baseline from previous work that we employed, a baseline that makes stricter assumptions. For 89.1% of the queries, a correct suggestion appears among the first five suggestions, and on average 63.1% of the first five suggestions are correct. An offline optimization strategy increased the probability that TAT3's first suggestion is correct by 6.05% relative to the unoptimized approach. Ensembling, used to address possible hallucinations of LLMs, increased that probability by 3.95% relative to choosing the best LLM. Additionally, we demonstrate that TAT3 can adapt the output format of its results to the rows already present in a table.
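The three benchmark figures in the abstract correspond to common ranking metrics: whether the first suggestion is correct, whether any of the first five is correct, and what fraction of the first five is correct. The sketch below shows one way such numbers could be computed. It is not taken from this repository; the function names, the exact-string matching, and the toy queries are assumptions made for illustration only.

```python
from typing import List, Sequence, Set, Tuple


def hit_at_k(suggestions: Sequence[str], relevant: Set[str], k: int) -> float:
    """Return 1.0 if any of the first k suggestions is correct, else 0.0."""
    return float(any(s in relevant for s in suggestions[:k]))


def precision_at_k(suggestions: Sequence[str], relevant: Set[str], k: int) -> float:
    """Return the fraction of the first k suggestions that are correct."""
    return sum(s in relevant for s in suggestions[:k]) / k


def mean(values: List[float]) -> float:
    """Average a non-empty list of per-query scores."""
    return sum(values) / len(values)


# Hypothetical toy data: (ranked suggestions, set of correct entities) per query.
queries: List[Tuple[List[str], Set[str]]] = [
    (["Berlin", "Paris", "Atlantis", "Rome", "Madrid"],
     {"Berlin", "Paris", "Rome", "Madrid"}),
    (["Narnia", "Vienna", "Oslo", "Gotham", "Mordor"],
     {"Vienna", "Oslo", "Lisbon"}),
]

# Share of queries whose first suggestion is correct.
first_correct = mean([hit_at_k(s, r, 1) for s, r in queries])
# Share of queries with a correct suggestion among the first five.
any_in_five = mean([hit_at_k(s, r, 5) for s, r in queries])
# Average fraction of correct suggestions among the first five.
avg_precision_five = mean([precision_at_k(s, r, 5) for s, r in queries])

print(first_correct, any_in_five, avg_precision_five)
```

Exact string matching is the simplest choice for this sketch; the thesis benchmark may normalize or link entities differently before counting a suggestion as correct.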

