The HuggingFace Demo has been ignoring the images I've uploaded #1

Open · hben35096 opened this issue Mar 14, 2024 · 21 comments
@hben35096 commented Mar 14, 2024

https://huggingface.co/spaces/naver-ai/VisualStylePrompting
[screenshot: PixPin_2024-03-15_03-43-39]
I tried 6 times.

@screan commented Mar 15, 2024

Same, it won't take an uploaded image.

@SoftologyPro commented Mar 19, 2024

Yes, please add the ability for users to specify their own style image without having to modify config files.
It should be simple: pick an image, type a prompt, generate.
The first thing users want after running the examples is "That's cool, how can I use my own style image now?"

@Joyofmovement

Please consider making it possible for users to use their own images as a style, and make it simple to do so, many thanks. I really like this concept, it's great.
Thanks for your contribution.

@taki0112 (Collaborator)

To accurately reflect the style of the user image, a description of that image is necessary. Since some users may struggle to write effective descriptions, we have not included this aspect in the demo.

We will update the demo code to support this by utilizing BLIP2.
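
For reference, a minimal sketch of that captioning step with the transformers BLIP-2 API (not the demo's exact code; the model id Salesforce/blip2-opt-2.7b and the 32-token caption budget are assumptions):

```python
# Sketch only: caption a user-supplied style image with BLIP-2 via transformers.
import torch
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

processor = Blip2Processor.from_pretrained("Salesforce/blip2-opt-2.7b")
model = Blip2ForConditionalGeneration.from_pretrained(
    "Salesforce/blip2-opt-2.7b", torch_dtype=dtype
).to(device)

def caption_style_image(path: str) -> str:
    """Generate a short description of the user's style image."""
    image = Image.open(path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt").to(device, dtype)
    ids = model.generate(**inputs, max_new_tokens=32)  # explicit token budget
    return processor.batch_decode(ids, skip_special_tokens=True)[0].strip()
```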

@SoftologyPro

> To accurately reflect the style of the user image, a description of that image is necessary. Since some users may struggle to write effective descriptions, we have not included this aspect in the demo.
>
> We will update the demo code to support this by utilizing BLIP2.

That would work. The user picks one of their images, BLIP2 captions it, the user gets an option to modify the detected caption if need be, and then the user image can be used to style any other image.
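
Roughly what that flow could look like in Gradio (a sketch only, not the repo's app.py; auto_caption and apply_style are stand-ins for the project's BLIP-2 captioning and style functions):

```python
# Sketch of the proposed UI flow: caption the uploaded style image, let the user edit
# the caption, then generate. The two functions below are placeholders, not the repo's code.
import gradio as gr

def auto_caption(style_image):
    # Placeholder: the real demo would call BLIP-2 here (see the sketch above).
    return "a short description of the style image"

def apply_style(style_image, caption, prompt):
    # Placeholder: the real demo would run Visual Style Prompting with the edited caption.
    return style_image

with gr.Blocks() as demo:
    style_img = gr.Image(type="pil", label="Style image")
    caption = gr.Textbox(label="Detected caption (edit if needed)")
    prompt = gr.Textbox(label="Target prompt")
    out = gr.Image(label="Result")

    # Caption the style image as soon as it is uploaded, but leave the text editable.
    style_img.upload(auto_caption, inputs=style_img, outputs=caption)
    gr.Button("Generate").click(apply_style, inputs=[style_img, caption, prompt], outputs=out)

demo.launch()
```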

@dhmiller123

This will be very helpful. Thank you. Looking forward to working with my own images.

> To accurately reflect the style of the user image, a description of that image is necessary. Since some users may struggle to write effective descriptions, we have not included this aspect in the demo.
>
> We will update the demo code to support this by utilizing BLIP2.

@taki0112 (Collaborator)

- There is an issue with the HF GPU, which HF is currently fixing.
- For this reason, the user style image features have been implemented but are not enabled in the demo.
- For now, try vsp_real_script.py.

@dhmiller123 commented Mar 26, 2024 via email

@SoftologyPro

> - There is an issue with the HF GPU, which HF is currently fixing.
> - For this reason, the user style image features have been implemented but are not enabled in the demo.
> - For now, try vsp_real_script.py.

Can you make an updated app.py for local running? I am trying to do this all locally on Windows, so it doesn't matter if it does not run as an HF online demo.

@dhmiller123 commented Mar 28, 2024 via email

@taki0112 (Collaborator) commented Apr 1, 2024

@dhmiller123 @SoftologyPro
Locally, you can try vsp_real_script.py.

@SoftologyPro

> @dhmiller123 @SoftologyPro
> Locally, you can try vsp_real_script.py.

I understand, but if you updated the gradio UI with that functionality it would make it easier for all users.

@taki0112 (Collaborator) commented Apr 1, 2024

We have recently updated the demo to reflect user images. However, due to an issue with the GPU provided by Hugging Face (HF), the functionality is not performing as expected. We have no choice but to wait until HF resolves this issue.

@SoftologyPro commented Apr 1, 2024

OK, I understand that too. But I don't want to run via Hugging Face; I want to run your Gradio demo locally under Windows. If you do have a version of the Gradio app.py that works locally, then please share it. The only version of app.py I have is the earlier one, which has since been removed from your repo.

@SoftologyPro commented Apr 1, 2024

i.e. the attached version of app.py (renamed to app.txt, as .py files do not seem to be attachable), running locally. That should get around any Hugging Face limitations?

[attachment: app.txt]
[screenshot: Screenshot 2024-04-01 183632]

@taki0112 (Collaborator) commented Apr 2, 2024

The demo is working now.

@dhmiller123 commented Apr 2, 2024 via email

@SoftologyPro commented Apr 2, 2024

OK, when I try to run the HF demo with my own style image I get GPU timeouts. Can you provide a working version of app.py to run locally? This is what I tried...

git clone https://huggingface.co/spaces/naver-ai/VisualStylePrompting
In app.py I had to comment out the first line (import spaces) and the other @spaces.GPU line.
Then running app.py opens the UI.
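
An alternative to hand-editing app.py, sketched below (the wrapper filename and the assumption that @spaces.GPU only appears as a decorator are mine): register a no-op spaces module before importing app.py, so the Spaces-only lines can stay in place.

```python
# run_local.py -- hypothetical wrapper, not part of the repo: installs a no-op stand-in
# for the `spaces` package used on HF Spaces hardware, then imports app.py unmodified.
import sys
import types

def _gpu(func=None, **kwargs):
    """No-op replacement for spaces.GPU; handles bare and parameterised decorator use."""
    if callable(func):
        return func                 # used as a bare decorator: @spaces.GPU
    return lambda f: f              # used with arguments: @spaces.GPU(duration=...)

_stub = types.ModuleType("spaces")
_stub.GPU = _gpu
sys.modules["spaces"] = _stub       # must be registered before app.py's `import spaces`

import app                          # noqa: E402 -- assumes app.py launches the demo on import
```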

I select my own style image, set a prompt, set the outputs to 1, and click Submit.
This gives these errors (the same as in the other issue I raised with vsp_real_script.py, #7):

Traceback (most recent call last):
  File "<path to local clone>venv\voc_visualstyleprompting\lib\site-packages\gradio\queueing.py", line 501, in call_prediction
    output = await route_utils.call_process_api(
  File "<path to local clone>venv\voc_visualstyleprompting\lib\site-packages\gradio\route_utils.py", line 253, in call_process_api
    output = await app.get_blocks().process_api(
  File "<path to local clone>venv\voc_visualstyleprompting\lib\site-packages\gradio\blocks.py", line 1695, in process_api
    result = await self.call_function(
  File "<path to local clone>venv\voc_visualstyleprompting\lib\site-packages\gradio\blocks.py", line 1235, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "<path to local clone>venv\voc_visualstyleprompting\lib\site-packages\anyio\to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "<path to local clone>venv\voc_visualstyleprompting\lib\site-packages\anyio\_backends\_asyncio.py", line 2144, in run_sync_in_worker_thread
    return await future
  File "<path to local clone>venv\voc_visualstyleprompting\lib\site-packages\anyio\_backends\_asyncio.py", line 851, in run
    result = context.run(func, *args)
  File "<path to local clone>venv\voc_visualstyleprompting\lib\site-packages\gradio\utils.py", line 692, in wrapper
    response = f(*args, **kwargs)
  File "<path to local clone>Visual Style Prompting\app.py", line 156, in style_fn
    ref_prompt = blip_inf_prompt(origin_real_img)
  File "<path to local clone>Visual Style Prompting\app.py", line 77, in blip_inf_prompt
    generated_ids = blip_model.generate(**inputs)
  File "<path to local clone>venv\voc_visualstyleprompting\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "<path to local clone>venv\voc_visualstyleprompting\lib\site-packages\transformers\models\blip_2\modeling_blip_2.py", line 1830, in generate
    outputs = self.language_model.generate(
  File "<path to local clone>venv\voc_visualstyleprompting\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "<path to local clone>venv\voc_visualstyleprompting\lib\site-packages\transformers\generation\utils.py", line 1466, in generate
    self._validate_generated_length(generation_config, input_ids_length, has_default_max_length)
  File "<path to local clone>venv\voc_visualstyleprompting\lib\site-packages\transformers\generation\utils.py", line 1186, in _validate_generated_length
    raise ValueError(
ValueError: Input length of input_ids is 0, but `max_length` is set to -13. This can lead to unexpected behavior. You should consider increasing `max_length` or, better yet, setting `max_new_tokens`.

If I then click the watercolor horse/tiger example and click Submit it works.

If I then select my own style image again and click Submit it does not crash, but still uses the previous watercolor style and ignores my style image.

[screenshot: Screenshot 2024-04-03 095343]
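
For reference, the ValueError above is transformers complaining that the effective max_length has gone negative, and the message itself suggests setting max_new_tokens instead. A minimal sketch of that workaround, reusing the blip_model and inputs names from app.py's blip_inf_prompt (the 32-token budget is an assumption):

```python
# Sketch only, not the repo's code: give BLIP-2's generate() an explicit new-token budget
# instead of relying on max_length, as the ValueError suggests.
generated_ids = blip_model.generate(**inputs, max_new_tokens=32)  # 32 is an assumed budget
```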

@SoftologyPro

OK, for those wanting to run this locally, I finally got it working after trying various package versions until these worked.

python -m pip install --upgrade pip
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts wheel==0.41.0
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts diffusers==0.27.0
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts accelerate==0.28.0
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts einops==0.7.0
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts kornia==0.7.2
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts gradio==4.25.0
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts transformers==4.39.3
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts opencv-python==4.9.0.80
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts xformers==0.0.25 --index-url https://download.pytorch.org/whl/cu118
pip uninstall -y torch
pip uninstall -y torch
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts torch==2.2.1+cu118 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
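
A quick way to confirm the cu118 torch build can actually see the GPU before launching the demo:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"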

https://softologyblog.wordpress.com/2023/10/10/a-plea-to-all-python-developers/

@SoftologyPro

Running locally under Windows 11 on a 4090.
[screenshot: Screenshot 2024-04-03 190125]

@SoftologyPro

> To accurately reflect the style of the user image, a description of that image is necessary. Since some users may struggle to write effective descriptions, we have not included this aspect in the demo.
> We will update the demo code to support this by utilizing BLIP2.

I think BLIP may also struggle to write an effective description. Would it help to show the detected caption and allow the user to edit it before use? When an example is clicked, show the caption used for it too.

Here are some "failed" results that might be improved with better caption text for the style images.

Do you think these results are due to the caption or just a bad style image choice?

The broccoli image was BLIP-captioned "broccoli is a vegetable that is very popular". Would a better prompt give a better styled result? Maybe just "broccoli".

The wave image was captioned "a large wave breaking on the ocean".

Those two and the tiger above are not as "clean" as the example results. For the tiger above I expected textures that matched the style image. Would a better caption help there?

[screenshots: Screenshot 2024-04-04 110536, Screenshot 2024-04-04 110849]
