ShowUI Slow speed on M1 16GB MBP? #46

Open
aristideubertas opened this issue Dec 6, 2024 · 3 comments
@aristideubertas

I am running ShowUI on my M1 MBP with 16GB of RAM, and I have noticed that it is very slow at performing actions when using ShowUI + GPT-4o. I am wondering if this is just my machine, or if the M1 is simply not powerful enough for this.

Basically, it is too slow to actually use. I have installed PyTorch with MPS support, so that shouldn't be the problem.

Any way I can benchmark it locally?

@yyyang-2019
Collaborator

Hi @aristideubertas,
We performed tests on an M2 MBP with 32GB RAM and saw about 15~20s per step for a 1920x1080px screenshot. We have to admit that, although lightweight, ShowUI is still a (large) VLM with 2B params, so local inference inevitably takes some time. For a faster response, you can try quantizing ShowUI and see whether it still does the job well!
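
For reference, a minimal sketch of what quantized / reduced-precision loading could look like, assuming the Hugging Face checkpoint showlab/ShowUI-2B (a Qwen2-VL-2B fine-tune) and bitsandbytes for 4-bit weights; note that bitsandbytes generally needs a CUDA GPU, so on Apple Silicon the closest drop-in is half precision on mps:

```python
import torch
from transformers import Qwen2VLForConditionalGeneration, BitsAndBytesConfig

# Checkpoint name is illustrative; substitute whichever ShowUI weights you use.
MODEL_ID = "showlab/ShowUI-2B"

if torch.cuda.is_available():
    # 4-bit NF4 weight quantization via bitsandbytes (CUDA only).
    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,
    )
    model = Qwen2VLForConditionalGeneration.from_pretrained(
        MODEL_ID,
        quantization_config=quant_config,
        device_map="auto",
    )
else:
    # Apple Silicon fallback: load weights in half precision on the MPS backend.
    model = Qwen2VLForConditionalGeneration.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,
    ).to("mps")
```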

@aristideubertas
Author

Hi @yyyang-2019, thanks for your answer.

Any idea if lowering the resolution will help cut down the latency significantly?

For many users it might be useful to have inference run on a remote machine - is this something you are thinking of supporting in the future? E.g. a remote ShowUI API endpoint.

@yyyang-2019
Collaborator

yyyang-2019 commented Dec 7, 2024

  1. For ShowUI, directly lowering the resolution may not speed up inference much because of its (Qwen's) tokenization: the main computational cost is driven by the number of visual tokens rather than by pixel-level operations. You can try modifying ShowUI's min_pixels and max_pixels to balance speed and memory (see the sketch after this list). Besides, quantizing the model may also be helpful.

  2. Great suggestion! But remote inference surely raises some privacy issues, so for now we are still focusing on improving local runs :)
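
For point 1, a minimal sketch of tightening the visual token budget, assuming ShowUI is loaded through the standard Qwen2-VL AutoProcessor (the exact limits are illustrative, and the checkpoint name is just an example):

```python
from transformers import AutoProcessor

MODEL_ID = "showlab/ShowUI-2B"  # illustrative checkpoint name

# Qwen2-VL-style processors accept min_pixels / max_pixels, which bound how many
# 28x28 patches (and therefore visual tokens) a screenshot is resized into.
processor = AutoProcessor.from_pretrained(
    MODEL_ID,
    min_pixels=256 * 28 * 28,    # lower bound on the resized image area
    max_pixels=1024 * 28 * 28,   # shrink this to trade accuracy for speed and memory
)
```

Lowering max_pixels reduces the number of image tokens per screenshot, which is usually where most of the per-step latency goes.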
