It's been a while, and training up and quit working for some unknown reason, so I tried a fresh install, and nothing changed. To make matters worse, I tried deleting the default_config.yaml file and reconfiguring it through setup.bat, and unlike last time, no dice. Sadly I'm a bit stupid when it comes to this programming stuff, but I gave the logs a look, realized I didn't have SD-Scripts in my folder, and adding it didn't fix things either...nice try though. I'm running short on things that have worked before, so after getting everything back to its pre-configured state, here is the latest log.
20:40:03-867615 INFO Kohya_ss GUI version: v24.1.7
fatal: not a git repository (or any of the parent directories): .git
20:40:04-132615 ERROR Error during Git operation: Command '['git', 'submodule', 'update', '--init', '--recursive',
'--quiet']' returned non-zero exit status 128.
20:40:04-135615 INFO nVidia toolkit detected
20:40:05-446617 INFO Torch 2.1.2+cu118
20:40:05-460618 INFO Torch backend: nVidia CUDA 11.8 cuDNN 8905
20:40:05-462615 INFO Torch detected GPU: NVIDIA GeForce RTX 4090 VRAM 24564 Arch (8, 9) Cores 128
20:40:05-466618 INFO Python version is 3.10.11 (tags/v3.10.11:7d4cc5a, Apr 5 2023, 00:38:17) [MSC v.1929 64 bit
(AMD64)]
20:40:05-467617 INFO Verifying modules installation status from requirements_pytorch_windows.txt...
20:40:05-469619 INFO Verifying modules installation status from requirements_windows.txt...
20:40:05-470616 INFO Verifying modules installation status from requirements.txt...
20:40:12-111332 INFO headless: False
20:40:12-146331 INFO Using shell=True when running external commands...
M:\kohya_ss\venv\lib\site-packages\gradio\analytics.py:106: UserWarning: IMPORTANT: You are using gradio version 4.43.0, however version 4.44.1 is available, please upgrade.
--------
warnings.warn(
Running on local URL: http://127.0.0.1:7860
To create a public link, set `share=True` in `launch()`.
20:40:23-479989 INFO Loading config...
20:40:25-151985 INFO Start training LoRA Standard ...
20:40:25-152985 INFO Validating lr scheduler arguments...
20:40:25-153984 INFO Validating optimizer arguments...
20:40:25-154984 INFO Validating M:\SampleImages\SDXL\Konata\Log existence and writability... SUCCESS
20:40:25-154984 INFO Validating M:\SampleImages\SDXL\Konata\Model existence and writability... SUCCESS
20:40:25-155985 INFO Validating M:/Forge/webui/models/Stable-diffusion/SDXL/hentaiMixXLRoadTo_v50.safetensors
existence... SUCCESS
20:40:25-156985 INFO Validating M:\SampleImages\SDXL\Konata\Images existence... SUCCESS
20:40:25-158985 INFO Folder 2_Konata: 2 repeats found
20:40:25-159985 INFO Folder 2_Konata: 20 images found
20:40:25-160985 INFO Folder 2_Konata: 20 * 2 = 40 steps
20:40:25-161986 INFO Regulatization factor: 1
20:40:25-161986 INFO Total steps: 40
20:40:25-162984 INFO Train batch size: 1
20:40:25-163984 INFO Gradient accumulation steps: 1
20:40:25-163984 INFO Epoch: 60
20:40:25-164984 INFO max_train_steps (40 / 1 / 1 * 60 * 1) = 2400
20:40:25-165985 INFO stop_text_encoder_training = 0
20:40:25-166983 INFO lr_warmup_steps = 0
20:40:25-167985 INFO Saving training config to M:\SampleImages\SDXL\Konata\Model\Konata_20241229-204025.json...
20:40:25-169986 INFO Executing command: M:\kohya_ss\venv\Scripts\accelerate.EXE launch --dynamo_backend no
--dynamo_mode default --mixed_precision bf16 --num_processes 1 --num_machines 1
--num_cpu_threads_per_process 2 M:/kohya_ss/sd-scripts/sdxl_train_network.py --config_file
M:\SampleImages\SDXL\Konata\Model/config_lora-20241229-204025.toml
20:40:25-172985 INFO Command executed.
2024-12-29 20:40:33 INFO Loading settings from train_util.py:4519
M:\SampleImages\SDXL\Konata\Model/config_lora-20241229-204025.toml...
INFO M:\SampleImages\SDXL\Konata\Model/config_lora-20241229-204025 train_util.py:4538
2024-12-29 20:40:34 INFO Using DreamBooth method. train_network.py:325
INFO prepare images. train_util.py:1971
INFO get image size from name of cache files train_util.py:1886
100%|██████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:00<?, ?it/s]
INFO set image size from cache files: 0/20 train_util.py:1916
INFO found directory M:\SampleImages\SDXL\Konata\Images\2_Konata contains 20 train_util.py:1918
image files
read caption: 100%|██████████████████████████████████████████████████████████████████| 20/20 [00:00<00:00, 3275.52it/s]
INFO 40 train images with repeating. train_util.py:2012
INFO 0 reg images. train_util.py:2015
WARNING no regularization images / 正則化画像が見つかりませんでした train_util.py:2020
INFO [Dataset 0] config_util.py:567
batch_size: 1
resolution: (1024, 1024)
enable_bucket: True
network_multiplier: 1.0
min_bucket_reso: 64
max_bucket_reso: 2048
bucket_reso_steps: 64
bucket_no_upscale: True
[Subset 0 of Dataset 0]
image_dir: "M:\SampleImages\SDXL\Konata\Images\2_Konata"
image_count: 20
num_repeats: 2
shuffle_caption: True
keep_tokens: 1
keep_tokens_separator:
caption_separator: ,
secondary_separator: None
enable_wildcard: False
caption_dropout_rate: 0.0
caption_dropout_every_n_epochs: 0
caption_tag_dropout_rate: 0.0
caption_prefix: None
caption_suffix: None
color_aug: False
flip_aug: False
face_crop_aug_range: None
random_crop: False
token_warmup_min: 1
token_warmup_step: 0
alpha_mask: False
custom_attributes: {}
is_reg: False
class_tokens: Konata
caption_extension: .txt
INFO [Dataset 0] config_util.py:573
INFO loading image sizes. train_util.py:923
100%|████████████████████████████████████████████████████████████████████████████████| 20/20 [00:00<00:00, 4115.90it/s]
INFO make buckets train_util.py:946
WARNING min_bucket_reso and max_bucket_reso are ignored if bucket_no_upscale is train_util.py:963
set, because bucket reso is defined by image size automatically /
bucket_no_upscaleが指定された場合は、bucketの解像度は画像サイズから自動計
算されるため、min_bucket_resoとmax_bucket_resoは無視されます
INFO number of images (including repeats) / train_util.py:992
各bucketの画像枚数(繰り返し回数を含む)
INFO bucket 0: resolution (768, 1216), count: 2 train_util.py:997
INFO bucket 1: resolution (768, 1280), count: 2 train_util.py:997
INFO bucket 2: resolution (832, 1152), count: 12 train_util.py:997
INFO bucket 3: resolution (832, 1216), count: 4 train_util.py:997
INFO bucket 4: resolution (896, 1088), count: 2 train_util.py:997
INFO bucket 5: resolution (896, 1152), count: 2 train_util.py:997
INFO bucket 6: resolution (960, 1024), count: 2 train_util.py:997
INFO bucket 7: resolution (1024, 960), count: 2 train_util.py:997
INFO bucket 8: resolution (1024, 1024), count: 2 train_util.py:997
INFO bucket 9: resolution (1088, 832), count: 2 train_util.py:997
INFO bucket 10: resolution (1152, 832), count: 2 train_util.py:997
INFO bucket 11: resolution (1216, 768), count: 4 train_util.py:997
INFO bucket 12: resolution (1216, 832), count: 2 train_util.py:997
INFO mean ar error (without repeats): 0.013208133072419337 train_util.py:1002
WARNING clip_skip will be unexpected / SDXL学習ではclip_skipは動作しません sdxl_train_util.py:351
INFO preparing accelerator train_network.py:379
accelerator device: cuda
INFO loading model for process 0/1 sdxl_train_util.py:32
INFO load StableDiffusion checkpoint: sdxl_train_util.py:73
M:/Forge/webui/models/Stable-diffusion/SDXL/hentaiMixXLRoadTo_v50.saf
etensors
2024-12-29 20:40:35 INFO building U-Net sdxl_model_util.py:198
INFO loading U-Net from checkpoint sdxl_model_util.py:202
2024-12-29 20:40:52 INFO U-Net: <All keys matched successfully> sdxl_model_util.py:208
INFO building text encoders sdxl_model_util.py:211
INFO loading text encoders from checkpoint sdxl_model_util.py:264
2024-12-29 20:40:53 INFO text encoder 1: <All keys matched successfully> sdxl_model_util.py:278
2024-12-29 20:40:59 INFO text encoder 2: <All keys matched successfully> sdxl_model_util.py:282
INFO building VAE sdxl_model_util.py:285
INFO loading VAE from checkpoint sdxl_model_util.py:290
INFO VAE: <All keys matched successfully> sdxl_model_util.py:293
import network module: networks.lora
2024-12-29 20:41:00 INFO [Dataset 0] train_util.py:2495
INFO caching latents with caching strategy. train_util.py:1048
INFO caching latents... train_util.py:1097
100%|████████████████████████████████████████████████████████████████████████████████| 20/20 [00:00<00:00, 2500.18it/s]
2024-12-29 20:41:02 INFO create LoRA network. base dim (rank): 32, alpha: 32 lora.py:935
INFO neuron dropout: p=None, rank dropout: p=None, module dropout: p=None lora.py:936
INFO create LoRA for Text Encoder 1: lora.py:1027
INFO create LoRA for Text Encoder 2: lora.py:1027
INFO create LoRA for Text Encoder: 264 modules. lora.py:1035
2024-12-29 20:41:03 INFO create LoRA for U-Net: 722 modules. lora.py:1043
INFO enable LoRA for text encoder: 264 modules lora.py:1084
INFO enable LoRA for U-Net: 722 modules lora.py:1089
prepare optimizer, data loader etc.
INFO use AdamW optimizer | {} train_util.py:4872
Traceback (most recent call last):
File "M:\kohya_ss\sd-scripts\sdxl_train_network.py", line 228, in <module>
trainer.train(args)
File "M:\kohya_ss\sd-scripts\train_network.py", line 571, in train
lr_scheduler = train_util.get_scheduler_fix(args, optimizer, accelerator.num_processes)
File "M:\kohya_ss\sd-scripts\library\train_util.py", line 5128, in get_scheduler_fix
if name == SchedulerType.COSINE_WITH_MIN_LR:
File "C:\Users\Ande\AppData\Local\Programs\Python\Python310\lib\enum.py", line 437, in __getattr__
raise AttributeError(name) from None
AttributeError: COSINE_WITH_MIN_LR
Traceback (most recent call last):
File "C:\Users\Ande\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\Ande\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "M:\kohya_ss\venv\Scripts\accelerate.EXE\__main__.py", line 7, in <module>
sys.exit(main())
File "M:\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 47, in main
args.func(args)
File "M:\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1017, in launch_command
simple_launcher(args)
File "M:\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 637, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['M:\\kohya_ss\\venv\\Scripts\\python.exe', 'M:/kohya_ss/sd-scripts/sdxl_train_network.py', '--config_file', 'M:\\SampleImages\\SDXL\\Konata\\Model/config_lora-20241229-204025.toml']' returned non-zero exit status 1.
20:41:05-828023 INFO Training has ended.
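A note on reading the traceback above: the last frame of the first traceback is in enum.py and raises `AttributeError: COSINE_WITH_MIN_LR`. That is what Python does when code reads a member that an Enum class does not define. The log does not say why the member is missing. One possibility, which is an assumption and not something confirmed in this thread, is that the installed package providing `SchedulerType` is a different version from the one sd-scripts was written against. The sketch below reproduces the failure with a hypothetical stand-in enum (not the real library class) and shows a lookup that treats a missing member as "no match" instead of crashing.

```python
from enum import Enum


class SchedulerType(Enum):
    """Hypothetical stand-in for a scheduler enum that lacks a newer member."""
    LINEAR = "linear"
    COSINE = "cosine"
    CONSTANT = "constant"


def has_member(enum_cls, member_name):
    """Return True if enum_cls defines a member called member_name."""
    return member_name in enum_cls.__members__


def is_scheduler(name, enum_cls, member_name):
    """Compare name to an enum member, treating a missing member as no match."""
    member = enum_cls.__members__.get(member_name)
    return member is not None and name == member


# Direct attribute access on a missing member raises AttributeError,
# the same exception type that ends the first traceback in the log.
try:
    SchedulerType.COSINE_WITH_MIN_LR
    raised = False
except AttributeError:
    raised = True

print("missing member raised AttributeError:", raised)
print("COSINE defined:", has_member(SchedulerType, "COSINE"))
print("COSINE_WITH_MIN_LR defined:", has_member(SchedulerType, "COSINE_WITH_MIN_LR"))
```

Running the same `__members__` check against the real enum inside the kohya_ss venv would show whether the member exists in that install, though which module the enum comes from has to be read from train_util.py itself.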
It sometimes happens to me; I just press start training again, and it starts. But my training runs also stop with the same errors in the middle of training, at completely random times.
Somehow that worked. I don't know why it worked, since it worked fine the last time I used it, but somehow it works now. I don't think I updated anything, so I'm at a loss as to what is going on here.
Also, when I say random, I mean it worked fine a month ago, and then up and dies on me today, with nothing being changed in between. This program, as useful as it is, is temperamental at best, and seems to only work when it wants to. At least it wasn't something I overlooked like running activate.bat with admin privileges. 🙄