cmdr2 edited this page Apr 5, 2023 · 14 revisions

📢 Request for contributors: Please consider contributing code to make this API doc a sphinx documentation, instead of a wiki page!

Core API

Overall architecture

Broadly, the API contains 5 modules:

sdkit.models # load/unloading models, downloading known models, scanning models
sdkit.generate # generating images
sdkit.filter # face restoration, upscaling
sdkit.train # model merge, and (in the future) more training methods
sdkit.utils # helper functions for images, tensors, hashing, and files

An sdkit.Context object is passed around, encapsulating runtime data (e.g. the device and vram_optimizations) as well as references to the loaded model objects and their file paths. Context is a thread-local object.

sdkit.Context

A thread-local container of data. Context is an instance of threading.local, so you can set custom attributes like context.foo = 2 at any time, to track temporary data in a thread-safe manner. This is especially important when using one thread per GPU.

class Context:
  models: dict = {} # model_type to model object. e.g. 'stable-diffusion': loaded_model_in_memory
  model_paths: dict = {} # required. model_type to the path to the model file. e.g. 'stable-diffusion': 'D:\\path\\to\\model.ckpt'
  model_configs: dict = {} # optional. model_type to the path to the config file, for custom models. e.g. 'stable-diffusion': 'D:\\pony_diffusion.yaml'

  device: str = 'cuda' # 'cuda' or 'cuda:0', or any 'cuda:N', or 'cpu'
  device_name: str = None # optional
  half_precision: bool = True
  vram_optimizations: set = set() # any of: 'KEEP_FS_AND_CS_IN_CPU', 'SET_ATTENTION_STEP_TO_4', 'KEEP_ENTIRE_MODEL_IN_CPU'

sdkit.models

Methods for loading/unloading models from memory, scanning models, as well as downloading/resolving known models from the models db.

load_model(context: Context, model_type: str, **kwargs)
unload_model(context: Context, model_type: str, **kwargs)
download_model(model_type: str, model_id: str, download_base_dir: str=None, subdir_for_model_type=True)
download_models(models: dict, download_base_dir: str=None, subdir_for_model_type=True)
resolve_downloaded_model_path(model_type: str, model_id: str, download_base_dir: str=None, subdir_for_model_type=True)
get_model_info_from_db(quick_hash=None, model_type=None, model_id=None)
scan_model(file_path)

Supported values for model_type are stable-diffusion, vae, hypernetwork, gfpgan, realesrgan.

If the model_type is stable-diffusion, then load_model() accepts an additional scan_model: bool argument. You can set it to False to skip scanning the stable diffusion model (for malicious content) while loading, to save time.

sdkit.generate

Methods for generating content using Stable Diffusion. Please ensure that the stable-diffusion model is loaded into memory before calling these methods.

generate_images(
    context: Context,
    prompt: str = "",
    negative_prompt: str = "",

    seed: int = 42,
    width: int = 512,
    height: int = 512,

    num_outputs: int = 1,
    num_inference_steps: int = 25,
    guidance_scale: float = 7.5,

    init_image = None, # string (path to file), or PIL.Image or a base64-encoded string
    init_image_mask = None, # string (path to file), or PIL.Image or a base64-encoded string
    prompt_strength: float = 0.8,
    preserve_init_image_color_profile = False,

    sampler_name: str = "euler_a", # "ddim", "plms", "heun", "euler", "euler_a", "dpm2", "dpm2_a", "lms",
                                   # "dpm_solver_stability", "dpmpp_2s_a", "dpmpp_2m", "dpmpp_sde", "dpm_fast"
                                   # "dpm_adaptive", "unipc_snr", "unipc_tu", "unipc_tq", "unipc_snr_2", "unipc_tu_2"
    hypernetwork_strength: float = 0,

    callback=None, # callback(latent_samples: Tensor, step_index: int)
)

Supported samplers (19): ddim, plms, heun, euler, euler_a, dpm2, dpm2_a, lms, dpm_solver_stability, dpmpp_2s_a, dpmpp_2m, dpmpp_sde, dpm_fast, dpm_adaptive, unipc_snr, unipc_tu, unipc_tq, unipc_snr_2, unipc_tu_2.

Note: img2img only supports DDIM. We're looking for code contributions to allow other samplers with img2img.

sdkit.filter

Methods for applying filters to images, such as face restoration and upscaling. Please ensure that the corresponding model is loaded into memory before applying its filter. E.g. the gfpgan model needs to be loaded with load_model(context, 'gfpgan') before calling apply_filters(context, 'gfpgan', images).

apply_filters(context: Context, filters, images, **kwargs)

sdkit.train

Methods for merging models. We're looking for code contributions to add training methods to this module.

merge_models(model0_path: str, model1_path: str, ratio: float, out_path: str, use_fp16=True)

sdkit.utils

log() # a basic logger, tracks INFO, milliseconds, and thread name

load_tensor_file(path)
save_tensor_file(data, path)
save_images(images: list, dir_path: str, file_name='image', output_format='JPEG', output_quality=75)
save_dicts(entries: list, dir_path: str, file_name='data', output_format='txt')

hash_bytes_quick(bytes)
hash_file_quick(model_path)
hash_url_quick(model_url)

img_to_base64_str(img, output_format="PNG", output_quality=75)
img_to_buffer(img, output_format="PNG", output_quality=75)
buffer_to_base64_str(buffered, output_format="PNG")
base64_str_to_buffer(img_str)
base64_str_to_img(img_str)
resize_img(img: Image, desired_width, desired_height)
apply_color_profile(orig_image: Image, image_to_modify: Image)

img_to_tensor(img: Image, batch_size, device, half_precision: bool, shift_range=False, unsqueeze=False)
get_image_latent_and_mask(context: Context, image: Image, mask: Image, desired_width, desired_height, batch_size)
latent_samples_to_images(context: Context, samples)

gc() # calls CPU-based GC, as well as torch GC

download_file(url: str, out_path: str) # Downloads large files (without storing them in memory), resumes incomplete downloads, shows progress bar

Models DB

The models db bundled with sdkit lists the known models.

sdkit includes a database of known models and their configurations. This lets you download a known model with a single line of code. You can customize where it saves the downloaded model.

Additionally, sdkit will attempt to automatically determine the configuration for a given model when loading it from disk. For example, if an SD 2.1 model is being loaded, sdkit will automatically know to use fp32 for attn_precision; if an SD 2.0 v-type model is being loaded, it will automatically use the v2-inference-v.yaml configuration. It does this by matching the quick-hash of the given model file against the list of known quick-hashes.

For models that don't match a known hash (e.g. custom models), or to override the config file, set the path to the config file in context.model_configs. e.g. context.model_configs['stable-diffusion'] = 'path/to/config.yaml'
