2023-04-24-hsu23a.md

File metadata and controls

49 lines (49 loc) · 1.81 KB
---
title: "Can Visual Scratchpads With Diagrammatic Abstractions Augment LLM Reasoning?"
abstract: "When humans reason about complex text-based questions, we leverage diagrammatic abstractions drawn on a visual scratchpad. In this paper, we introduce and explore the capabilities of Visual-Scratchpad, a method that augments a large language foundation model (LLM) with diagrammatic execution and readout. We enable the LLM to generate drawing commands and to read out abstractions from the resulting picture. The visual readout operation uses a visual foundation model, optionally finetuned with expert iteration. Here, we show that although Visual-Scratchpad outperforms an inference-only LLM, it surprisingly yields worse performance compared to a single finetuned LLM. Through experiments, we propose that this gap is due to the failure mode of vision foundation models in understanding abstractions in diagrams."
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: hsu23a
month: 0
tex_title: "Can Visual Scratchpads With Diagrammatic Abstractions Augment LLM Reasoning?"
firstpage: 21
lastpage: 28
page: 21-28
order: 21
cycles: false
bibtex_author: Hsu, Joy and Poesia, Gabriel and Wu, Jiajun and Goodman, Noah
author:
- given: Joy
  family: Hsu
- given: Gabriel
  family: Poesia
- given: Jiajun
  family: Wu
- given: Noah
  family: Goodman
date: 2023-04-24
address:
container-title: "Proceedings on \"I Can't Believe It's Not Better: Failure Modes in the Age of Foundation Models\" at NeurIPS 2023 Workshops"
volume: '239'
genre: inproceedings
issued:
  date-parts:
  - 2023
  - 4
  - 24
pdf:
extras:
---