Configurable hf cache #30

Open: wants to merge 81 commits into base: master

picobyte (Owner)

Pick up the Hugging Face cache directory from environment variables if they are set, otherwise use a default, while keeping it configurable via settings.
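
A minimal sketch of such a lookup order, not the code from this PR. It assumes the setting is a plain string, that an explicit setting takes precedence over the environment, and that the variables consulted are HF_HUB_CACHE, the older HUGGINGFACE_HUB_CACHE, and HF_HOME. The function name `resolve_hf_cache` is hypothetical.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_hf_cache(setting: Optional[str] = None,
                     env: Mapping[str, str] = os.environ) -> Path:
    """Return the cache directory to use for Hugging Face downloads."""
    # Assumption: a non-empty value from the settings page wins.
    if setting:
        return Path(setting).expanduser()
    # Next, environment variables that point directly at the hub cache.
    for name in ("HF_HUB_CACHE", "HUGGINGFACE_HUB_CACHE"):
        if env.get(name):
            return Path(env[name]).expanduser()
    # HF_HOME is the parent directory; the hub cache is its "hub" subfolder.
    if env.get("HF_HOME"):
        return Path(env["HF_HOME"]).expanduser() / "hub"
    # Default location when nothing is configured.
    return Path.home() / ".cache" / "huggingface" / "hub"
```

Passing the environment as a parameter keeps the function easy to test; the real extension may well order these sources differently.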

info=shared.OptionInfo(
    False,
    label='Unload tensorflow models from memory (experimental).',
@picobyte picobyte Jul 23, 2023

Never mind, this is OK. It is a leftover from cleaning up the experimental unloading, which wasn't working. See 6b26d2d

Roel Kluin added 3 commits July 23, 2023 21:37
subclass of HFInterrogator. Allow more HuggingFace parameters for those who
can use them. The user can set HF_HUB_OFFLINE; then, or if the connection
cannot be made, the fallback is the local directory. If that does not
exist either, just stop the interrogation empty-handed.
and pick up the configured interrogators there.

presets in tagger/presets.py and tagger/utils.py can go.

write info alongside model so we can check its up to date status
tagger/preset.py Outdated


preset = Preset(Path(
os.path.join(extensions_dir, 'stable-diffusion-webui-wd14-tagger/presets')
@WSH032 WSH032 Jul 24, 2023

Refer to AUTOMATIC1111/stable-diffusion-webui/wiki/Developing-extensions

from modules import scripts
str(scripts.basedir())  # the same as os.path.join(extensions_dir, 'stable-diffusion-webui-wd14-tagger')

How about

os.path.join(scripts.basedir(), 'presets')

I tried this at first, but somehow ended up with the folder 'stable-diffusion-webui/presets'.

@picobyte picobyte Jul 24, 2023

(same person BTW)

Then, how about this:

# the same as os.path.join(extensions_dir, 'stable-diffusion-webui-wd14-tagger')
Path(__file__).parent.parent

But this is a bit ugly

Yes, that will also work; I will adapt it. Is extensions_dir problematic?

extensions_dir is OK, but the hard-coded "stable-diffusion-webui-wd14-tagger" may be problematic, because users may rename the folder.
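
A small sketch of the `Path(__file__)` approach suggested above, which does not depend on the folder name. It assumes the calling module sits one level below the extension root (for example in tagger/); the helper names `extension_root` and `presets_dir` are hypothetical.

```python
from pathlib import Path


def extension_root(module_file: str) -> Path:
    """Return the extension folder for a module located in <root>/tagger/."""
    # parent is the package folder (e.g. tagger/), parent.parent the extension root.
    return Path(module_file).resolve().parent.parent


def presets_dir(module_file: str) -> Path:
    """Return the presets folder inside the extension root."""
    return extension_root(module_file) / "presets"
```

From within tagger/preset.py this would be called as `presets_dir(__file__)`, and it keeps working if the user renames the extension folder.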

I have adapted it, among other changes. I will push the changes later today and merge if all is well.

    for i, filen in enumerate([self.model_path, self.tags_path]):
        self.hf_params['filename'] = filen
        paths[i] = hf_hub_download(**self.hf_params)
except Exception as err:
Refer to huggingface_hub/file_download.py.
This error already seems to be caught there.

It seems that it can be used offline without causing any errors if the environment variable HF_HUB_OFFLINE = True is set, or if hf_hub_download is called with local_files_only = True.

@picobyte picobyte Jul 24, 2023

For the latter, this can now be set in Settings -> Tagger -> HuggingFace parameters (see the other comment). I still think we want to try to fall back to the local directory if the Hugging Face download fails.
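
The fallback described here could look roughly like the sketch below. It is an illustration under assumptions, not the branch's implementation: the downloader is injected as a callable so the example runs without a network, and `fetch_with_local_fallback` and the file names in the usage note are hypothetical.

```python
from pathlib import Path
from typing import Callable, List, Optional, Sequence


def fetch_with_local_fallback(download: Callable[[str], str],
                              filenames: Sequence[str],
                              local_dir: Path) -> Optional[List[Path]]:
    """Try the hub first, then local_dir; return None if neither works."""
    try:
        return [Path(download(name)) for name in filenames]
    except Exception as err:  # broad on purpose: network, proxy, offline mode
        print(f"hub download failed ({err}), trying {local_dir}")
    local = [local_dir / name for name in filenames]
    if all(p.is_file() for p in local):
        return local
    # Neither source is usable: the caller should stop the interrogation.
    return None
```

In the extension, `download` might wrap the real call, for example `lambda name: hf_hub_download(repo_id=some_repo, filename=name)`, where `some_repo` stands for whatever repository is configured.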

WSH032 commented Jul 24, 2023

So the functionality of this feature branch is like this:

  1. If users can’t even use hf_hub_download to download the model the first time, they can use files downloaded manually from the Hugging Face website by setting local_model and local_tags in interrogators.json, thus avoiding any call to hf_hub_download (because there is no cache at this point).
  2. If users have previously downloaded and cached the model files with hf_hub_download, then when proxy or network problems occur later, they can set the environment variable HF_HUB_OFFLINE = True to use the existing cache in offline mode, without causing any errors.

Am I right?
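
To make the first case concrete, here is a hedged sketch of how such an entry might be read. Only the key names local_model and local_tags come from this thread; the entry name, the file paths, and the helper `local_files` are hypothetical, and the real interrogators.json schema may differ.

```python
import json
from pathlib import Path
from typing import Optional, Tuple

# Hypothetical entry; only the key names local_model and local_tags
# are taken from the discussion above.
EXAMPLE_JSON = """
{
  "my-local-tagger": {
    "local_model": "models/my-tagger/model.onnx",
    "local_tags": "models/my-tagger/selected_tags.csv"
  }
}
"""


def local_files(entry: dict, base: Path) -> Optional[Tuple[Path, Path]]:
    """Return (model, tags) paths if the entry points at existing local files."""
    model, tags = entry.get("local_model"), entry.get("local_tags")
    if not model or not tags:
        return None
    paths = (base / model, base / tags)
    # Only use the local files if both are actually present on disk.
    return paths if all(p.is_file() for p in paths) else None


entries = json.loads(EXAMPLE_JSON)
```

When `local_files` returns a pair of paths, no hub download is needed at all; when it returns None, the caller would continue with the cached or online route.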

f'cache_dir="{Its.hf_cache}"')
attrs = [attr.split('=') for attr in map(str.strip, attrs.split(','))]

signature = inspect.signature(hf_hub_download)
@picobyte picobyte Jul 24, 2023

Here I read all available Hugging Face parameters. I allow the user to set some of them in interrogators.json: the repository-, revision- and library-specific information. I consider the other parameters generic, and those can now be set in Settings -> Tagger -> HuggingFace parameters. I could even allow overriding some more of these settings per repository via the JSON, but I'm not sure yet which overrides would be sensible to allow.

If all is well, the type of each value is evaluated. However, something may go wrong if the user specifies a wrong type for a variable: the value then ends up as a string, and hf_hub_download may fail. But the Settings tab indicates that users should be careful with their entries, and a lot of mistakes are already covered. Also note that this is all done without eval(), which can be dangerous.
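
One way to parse such a parameter string without eval() is sketched below. This is not the code from the diff: it uses ast.literal_eval as a stand-in technique for the type handling, and `parse_params` and `fake_download` are hypothetical names. `fake_download` only imitates a few hf_hub_download-style parameters so the example runs without huggingface_hub installed.

```python
import ast
import inspect
from typing import Any, Callable, Dict


def parse_params(text: str, target: Callable[..., Any]) -> Dict[str, Any]:
    """Parse 'key=value, key=value' into kwargs that target accepts."""
    accepted = inspect.signature(target).parameters
    result: Dict[str, Any] = {}
    for item in filter(None, map(str.strip, text.split(","))):
        key, sep, raw = item.partition("=")
        key, raw = key.strip(), raw.strip()
        if not sep or key not in accepted:
            continue  # skip malformed entries and unknown parameter names
        try:
            # literal_eval only accepts Python literals (numbers, strings,
            # booleans, None, ...), so no arbitrary code is executed.
            result[key] = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            result[key] = raw  # not a literal: keep it as a plain string
    return result


def fake_download(repo_id: str, filename: str = "", cache_dir: str = "",
                  local_files_only: bool = False, etag_timeout: float = 10):
    """Stand-in with a few hf_hub_download-like parameters (hypothetical)."""
```

For example, `parse_params('cache_dir="/tmp/hf", local_files_only=True', fake_download)` yields a dict with a string and a boolean. Splitting on commas means values that themselves contain commas are not supported in this sketch.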

@picobyte picobyte Sep 19, 2023

So, to clearly answer your question (sorry for the late answer): yes, that's how it should work. Please note that the environment variable needs to be set in the active virtual environment, if you have one, or in your run script. I could also add a --hub-offline flag to automate it. By the way, in the console use export HF_HUB_OFFLINE=1 (no spaces, and 1 is more common, though True might just work too). In Python, os.environ["HF_HUB_OFFLINE"] = "1" should do the same.
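
A short sketch of checking the variable from Python. The set of accepted values is an assumption about how huggingface_hub interprets boolean environment variables (1, ON, YES, TRUE, case-insensitive); `hub_offline` is a hypothetical helper, not part of any library.

```python
import os
from typing import Mapping

# Assumption: these mirror the values huggingface_hub treats as "true".
TRUE_VALUES = {"1", "ON", "YES", "TRUE"}


def hub_offline(env: Mapping[str, str] = os.environ) -> bool:
    """Report whether HF_HUB_OFFLINE is set to a truthy value."""
    return env.get("HF_HUB_OFFLINE", "").upper() in TRUE_VALUES
```

If the variable is set from Python with `os.environ["HF_HUB_OFFLINE"] = "1"`, it is probably safest to do so before huggingface_hub is imported, since the library appears to read it at import time.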

Roel Kluin added 27 commits September 16, 2023 18:42
…not,

if name_in_queue is given, the interrogation is queued under that name. The
response is the number of all processed for all active queues & all models.
If name_in_queue is empty, the queue is marked as finished. A response is
awaited for remaining interrogations. The response is only for this queue.

at least that's how it's supposed to work. no testing yet, except compile.
The name may be empty, better ignore it for single-image. The deeper nested
object is required, because the queued query requires a per name
interrogation, so maybe better to separate tag from rating, for single.

To allow distinction prepend ratings in batch query with 'rating:'
when the queue return is asked (by not providing a name)
variable i was shadowed
then in order corrected the issues:
only before interrogation the image needs to be decoded.

TypeError: object TaggerInterrogateResponse can't be used in 'await'
expression
TypeError: 'TaggerInterrogateResponse' object is not subscriptable

and now there's something not awaited.. (no completion)
because the finish works, even for two images. Concurrent queues still
seem to fail too, however.
gvi
# Conflicts:
#	tagger/settings.py

# Conflicts:
#	tagger/interrogator.py
#	tagger/utils.py
and pick up the configured interrogators there.

presets in tagger/presets.py and tagger/utils.py can go.

write info alongside model so we can check its up to date status

# Conflicts:
#	tagger/ui.py
#	tagger/utils.py

# Conflicts:
#	tagger/interrogator.py
#	tagger/ui.py

# Conflicts:
#	tagger/api.py
#	tagger/ui.py

# Conflicts:
#	tagger/interrogator.py
#	tagger/preset.py
@picobyte picobyte force-pushed the configurable_hf_cache branch from d9c6e4a to 486edca September 20, 2023 16:02
Roel Kluin added 2 commits September 20, 2023 19:01
using json schema and json entries. But I'd like to
4 participants