ShowUI

ShowUI: One Vision-Language-Action Model for GUI Visual Agent
Kevin Qinghong Lin, Linjie Li, Difei Gao, Zhengyuan Yang, Shiwei Wu, Zechen Bai, Weixian Lei, Lijuan Wang, Mike Zheng Shou
Show Lab @ National University of Singapore, Microsoft

🔥 Update

[2024.12.18] Uploaded the UI-Graph preprocessing and visualization code.
[2024.12.15] ShowUI received the Outstanding Paper Award at the NeurIPS 2024 Open-World Agents workshop.
[2024.12.9] Added support for int8 quantization.
[2024.12.5] Major update: ShowUI is integrated into OOTB for running locally!
[2024.12.1] We support iterative refinement to improve grounding accuracy. Try it in the HF Spaces demo.
[2024.11.27] We released the arXiv paper, the HF Spaces demo, and ShowUI-desktop-8K.
[2024.11.16] showlab/ShowUI-2B is available on Hugging Face.

🖥️ Computer Use

See Computer Use OOTB to control your PC with ShowUI.

Demo video: computer_use_with_showui-en-s.mp4

⭐ Quick Start

See Quick Start for model usage.

🤗 Local Gradio

See Gradio for installation.

BibTeX

If you find our work helpful, please consider citing our paper.

@misc{lin2024showui,
      title={ShowUI: One Vision-Language-Action Model for GUI Visual Agent}, 
      author={Kevin Qinghong Lin and Linjie Li and Difei Gao and Zhengyuan Yang and Shiwei Wu and Zechen Bai and Weixian Lei and Lijuan Wang and Mike Zheng Shou},
      year={2024},
      eprint={2411.17465},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2411.17465}, 
}
