Configurable hf cache #30

picobyte · 2023-07-23T13:43:19Z

Pick up the HuggingFace cache directory from environment variables if set, or fall back to a default, but keep it configurable via settings.

picobyte · 2023-07-23T13:52:15Z

tagger/settings.py

        info=shared.OptionInfo(
-            False,
-            label='Unload tensorflow models from memory (experimental).',


Never mind, this is OK. It is a leftover cleanup of the experimental unloading, which wasn't working. See 6b26d2d.

subclass of HFInterrogator. Allow more HuggingFace parameters for those who can use them. The user can then set HF_HUB_OFFLINE; otherwise, if the connection cannot be made, the fallback is the local directory. If that does not exist either, stop the interrogation empty-handed.

and pick up the configured interrogators there. presets in tagger/presets.py and tagger/utils.py can go. write info alongside model so we can check its up to date status

WSH032 · 2023-07-24T06:53:44Z

tagger/preset.py

+
+
+preset = Preset(Path(
+    os.path.join(extensions_dir, 'stable-diffusion-webui-wd14-tagger/presets')


refer to AUTOMATIC1111/stable-diffusion-webui/wiki/Developing-extensions

from modules import scripts
str(scripts.basedir())  # the same as os.path.join(extensions_dir, 'stable-diffusion-webui-wd14-tagger')

How about

os.path.join(scripts.basedir(), 'presets')

I tried this at first, but somehow ended up with the folder 'stable-diffusion-webui/presets'.

(same person BTW)

Then, how about this:

# the same as os.path.join(extensions_dir, 'stable-diffusion-webui-wd14-tagger')
Path(__file__).parent.parent

But this is a bit ugly

Yes, that will work as well; I will adapt it. Is extensions_dir problematic?

extensions_dir is OK, but "stable-diffusion-webui-wd14-tagger" may be problematic, because users may change the name of the folder.

I have adapted it, amongst other changes. I will add the changes later today and merge, if all is well.
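The option the thread settles on can be sketched as follows. This is only an illustration, not the code that was merged: `presets_dir` is a hypothetical helper, and it assumes the calling module sits one level below the extension root (as tagger/preset.py does), so the result no longer depends on what the extension folder is called.

```python
from pathlib import Path


def presets_dir(module_file):
    """Return <extension root>/presets for a module in <extension root>/tagger/.

    Hypothetical helper: module_file is normally __file__ of tagger/preset.py.
    Two .parent steps climb from the file to tagger/ and then to the root.
    """
    return Path(module_file).parent.parent / 'presets'


# Works even when the user renamed the extension folder.
example = presets_dir('/webui/extensions/renamed-tagger/tagger/preset.py')
print(example)
```

Inside the extension one would pass `__file__`; the literal path above only stands in for it.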

WSH032 · 2023-07-24T08:03:19Z

tagger/interrogator.py

+                for i, filen in enumerate([self.model_path, self.tags_path]):
+                    self.hf_params['filename'] = filen
+                    paths[i] = hf_hub_download(**self.hf_params)
+            except Exception as err:


refer to huggingface_hub/file_download.py
This error seems to have been caught.

It seems that it can be used offline without causing any errors if the environment variable HF_HUB_OFFLINE=True is set, or if hf_hub_download is called with local_files_only=True.

The latter can now be set in Settings -> Tagger -> HuggingFace parameters (see the other comment). I still think we want to try to fall back to the local dir if hf fails.
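A minimal sketch of the fallback idea discussed here, under stated assumptions: `fetch_with_fallback` and its `download` argument are hypothetical names, and `download` stands for any callable that takes `filename=...` and returns a path (for example a `functools.partial` around `hf_hub_download` with the repository arguments already filled in). It is not the branch's actual implementation.

```python
from pathlib import Path


def fetch_with_fallback(download, filenames, local_dir):
    """Resolve every file via download(), falling back to local_dir.

    Returns a list of Paths, or None when a file is available from neither
    source, in which case the caller should skip the interrogation.
    """
    paths = []
    for name in filenames:
        try:
            paths.append(Path(download(filename=name)))
        except Exception as err:  # network, proxy or cache failure
            candidate = Path(local_dir) / name
            if not candidate.is_file():
                print(f'{name}: download failed ({err}), no local copy found')
                return None
            paths.append(candidate)
    return paths
```

Catching a broad `Exception` mirrors the `except Exception as err:` in the diff above; a narrower exception list may be preferable once the failure modes are known.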

WSH032 · 2023-07-24T08:14:08Z

So the functionality of this feature branch is like this:

If the users can’t even use hf_hub_download to download the model for the first time, they can use the files manually downloaded from the huggingface website by setting local_model and local_tags in interrogators.json, thus avoiding any calls to hf_hub_download (because there is no cache at this point).
If the users have used hf_hub_download to download and cache the model files before, in their subsequent use, when proxy or network problems occur, they can set the environment variable HF_HUB_OFFLINE = True to use the previous cache in offline mode, without causing any errors.

Am I right?

picobyte · 2023-07-24T16:46:07Z

tagger/interrogator.py

+                        f'cache_dir="{Its.hf_cache}"')
+        attrs = [attr.split('=') for attr in map(str.strip, attrs.split(','))]
+
+        signature = inspect.signature(hf_hub_download)


Here I read all available HuggingFace parameters. I allow the user to set some in interrogators.json: repository-, revision-, and library-specific information. The other information I consider generic, and it can now be set in Settings -> Tagger -> HuggingFace parameters. I could even allow overriding more of these settings per repository via the JSON, but I'm not sure yet which ones would be sensible to allow.

If all is well, the type is evaluated. However, something may go wrong if the user specifies a wrong type for a variable: the value then ends up as a string, and hf_hub_download may fail on it. But the Settings tab indicates that users should be careful with their entries, and a lot of mistakes are already covered. Also note that this is all done without eval(), which can be dangerous.
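One way to picture this kind of parsing is shown below. It is a sketch, not the code in tagger/interrogator.py: `parse_params` and `fake_download` are hypothetical names, and `ast.literal_eval` is my own substitute for whatever type handling the branch uses. Like the approach described above, it avoids `eval()`, keeps only names found in the target function's signature, and leaves anything it cannot interpret as a string.

```python
import ast
import inspect


def fake_download(repo_id, filename, cache_dir=None,
                  local_files_only=False, etag_timeout=10):
    """Hypothetical stand-in for hf_hub_download; only its signature is used."""
    return repo_id, filename, cache_dir, local_files_only, etag_timeout


def parse_params(text, func):
    """Parse 'key=value, key=value' into kwargs that func accepts."""
    allowed = inspect.signature(func).parameters
    params = {}
    for item in text.split(','):
        key, sep, raw = item.strip().partition('=')
        key, raw = key.strip(), raw.strip()
        if not sep or key not in allowed:
            continue  # malformed entry or unknown parameter: ignore it
        try:
            # literal_eval only accepts Python literals, never arbitrary code
            params[key] = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            params[key] = raw  # not a literal: keep it as a plain string
    return params


print(parse_params('cache_dir="/tmp/hf", local_files_only=True, bogus=1',
                   fake_download))
```

Splitting on commas is deliberately naive, so a value that itself contains a comma would be cut in two.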

So, to clearly answer your question (sorry for the late answer): yes, that's how it should work. Please note that the environment variable needs to be set in the active virtual environment, if you have one, or in your run script. I could also add a --hub-offline flag to automate it. In the console, use export HF_HUB_OFFLINE=1, by the way (no spaces, and 1 is more common, though True might just work, too). In Python, os.environ["HF_HUB_OFFLINE"] = "1" should do the same.
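For the Python route, a small sketch follows. The helper `hub_offline_requested` and the `TRUTHY` set are my own illustration; the set is an assumption about which spellings the library treats as true, not something confirmed in this thread. As far as I know, huggingface_hub reads the variable when it is imported, so the assignment should run before that import.

```python
import os

# Assumption: spellings treated as "true", compared case-insensitively.
TRUTHY = {'1', 'ON', 'YES', 'TRUE'}


def hub_offline_requested(environ=None):
    """Report whether HF_HUB_OFFLINE asks for offline mode."""
    if environ is None:
        environ = os.environ
    return environ.get('HF_HUB_OFFLINE', '').strip().upper() in TRUTHY


# Set the variable early, before huggingface_hub is imported anywhere.
os.environ['HF_HUB_OFFLINE'] = '1'
print(hub_offline_requested())
```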


…not, if name_in_queue is given, the interrogation is queued under that name. The response is the number of all processed for all active queues and all models. If name_in_queue is empty, the queue is marked as finished, and a response is awaited for the remaining interrogations. The response is only for this queue. At least that's how it's supposed to work. No testing yet, except compile.

The name may be empty, better ignore it for single-image. The deeper nested object is required, because the queued query requires a per name interrogation, so maybe better to separate tag from rating, for single. To allow distinction prepend ratings in batch query with 'rating:'
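The 'rating:' prefix could look roughly like this. The function name and the shape of its inputs (two name-to-score mappings) are assumptions made for illustration; the actual response model in tagger/api.py may differ.

```python
def merge_for_batch(ratings, tags):
    """Combine ratings and tags into one mapping for a batch response.

    Ratings get a 'rating:' prefix so they stay distinguishable from tags
    that happen to share a name.
    """
    merged = {f'rating:{name}': score for name, score in ratings.items()}
    merged.update(tags)
    return merged


print(merge_for_batch({'general': 0.9}, {'outdoors': 0.8}))
```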


Variable i was shadowed. Then, in order, corrected these issues: the image only needs to be decoded right before interrogation; TypeError: object TaggerInterrogateResponse can't be used in 'await' expression; TypeError: 'TaggerInterrogateResponse' object is not subscriptable. And now there's something not awaited (no completion).



Roel Kluin added 3 commits July 23, 2023 21:37

a little more work is required

714c335

add interrogators.json move refresh to interrogator as a static,

e4c056a

and pick up the configured interrogators there. presets in tagger/presets.py and tagger/utils.py can go. write info alongside model so we can check its up to date status


Roel Kluin and others added 22 commits July 25, 2023 16:53

cleanup

520915b

Merge branch 'master' into configurable_hf_cache

4d82af3

As suggested by WSH032

1d71a8e

As WSH032 mentioned, this is already caught.

50203aa

This was a bug

543fd6e

allow json to override settings

3122bfd

broken

9c6df86

Merge branch 'master' into configurable_hf_cache

d6dac05

Style is deprecated

fe9f3fa

improve wording, this is just a warning

1845c72

cleanups

1f7ef93

gvi

07cd9bd

a little more work is required

77797f1

fix again

f7f8f18

Merge branch 'configurable_hf_cache' into configurable_hf_cache2

0caa420

# Conflicts: # preload.py # tagger/interrogator.py # tagger/preset.py # tagger/ui.py

Actually use --additional-device-ids arg

39d4fd2

currently only one device is supported

ae8345b

Revert "improve types and other cleanups"

aa92dba

This reverts commit 0c1fd97.

Use a state variable to be able to send the tags via txt2img and friends

63892a5

Correct nr of args in error returns

b5d5e6b

# Conflicts: # tagger/ui.py

recursive glob was failing for directories

fcbc59b

Roel Kluin added 27 commits September 16, 2023 18:42

add missing api_model defaults

83614c1

no running event loop

0be994c

first success with running using queue, but the queue is only started

c1ce31e

when the queue return is asked (by not providing a name)

This does something more, but still some tasks do not complete.

58f76e8

although an interrogation does not complete, it is queued and executed

05901ad

because the finish works, even for two images. Concurrent queues still seem to fail too, however.

(off-topic) allow grepping from stdin

6407d48

finally some progress

803ef56

fix name clobber issue

b057db1

prevent sha256 dup

54add7f

For the first image, if no queue name is provided, generate a random

cc1768a

queue name, not in use.

move bash_scripts/tag_based_image_dedup.sh

4579951

update chagelog

34759d6

bump version

0ef398a

gvi

cc4b4c2

# Conflicts: # tagger/settings.py # Conflicts: # tagger/interrogator.py # tagger/utils.py

broken

571c56c

cleanups

0880cd9

Fixes, Tagger is loaded without initialization errors

b564858

clean up

06927ea

fix

5d470ef

clean up

8bbad93

this seems to work again, requires json edits for custom models, though

2741435

add fromfile interrogator

6d68c96

fix fromfileinterrogator

486edca

picobyte force-pushed the configurable_hf_cache branch from d9c6e4a to 486edca · September 20, 2023 16:02

Roel Kluin added 2 commits September 20, 2023 19:01

fix

ac32226

The intent was to allow editing the interrogator properties in settings

6262435

using json schema and json entries. But I'd like to
