
Repository for ShowUI: One Vision-Language-Action Model for GUI Visual Agent

ShowUI

🤗 Hugging Face Models | 📑 Paper | 🤗 Spaces Demo | 🕹️ OpenBayes
🤗 Datasets | 💬 X (Twitter) | 🖥️ Computer Use | 📖 GUI Paper List

ShowUI: One Vision-Language-Action Model for GUI Visual Agent
Kevin Qinghong Lin, Linjie Li, Difei Gao, Zhengyuan Yang, Shiwei Wu, Zechen Bai, Weixian Lei, Lijuan Wang, Mike Zheng Shou
Show Lab @ National University of Singapore, Microsoft

🔥 Update

🖥️ Computer Use

See Computer Use OOTB to control your PC with ShowUI.

⭐ Quick Start

See Quick Start for model usage.

🤗 Local Gradio

See Gradio for installation.

BibTeX

If you find our work helpful, please consider citing our paper.

@misc{lin2024showui,
      title={ShowUI: One Vision-Language-Action Model for GUI Visual Agent},
      author={Kevin Qinghong Lin and Linjie Li and Difei Gao and Zhengyuan Yang and Shiwei Wu and Zechen Bai and Weixian Lei and Lijuan Wang and Mike Zheng Shou},
      year={2024},
      eprint={2411.17465},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2411.17465}
}