path: root/modules/sd_hijack_optimizations.py
Age | Commit message | Author | Lines
2023-06-07 | Merge pull request #11066 from aljungberg/patch-1 | AUTOMATIC1111 | -1/+1
Fix upcast attention dtype error.
2023-06-06 | Fix upcast attention dtype error. | Alexander Ljungberg | -1/+1
Without this fix, enabling the "Upcast cross attention layer to float32" option while also using `--opt-sdp-attention` breaks generation with an error:
```
File "/ext3/automatic1111/stable-diffusion-webui/modules/sd_hijack_optimizations.py", line 612, in sdp_attnblock_forward
    out = torch.nn.functional.scaled_dot_product_attention(q, k, v, dropout_p=0.0, is_causal=False)
RuntimeError: Expected query, key, and value to have the same dtype, but got query.dtype: float key.dtype: float and value.dtype: c10::Half instead.
```
The fix is to make sure to upcast the value tensor too.
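A minimal sketch of the idea behind the fix, assuming PyTorch 2.x (`scaled_dot_product_attention` requires query, key and value to share a dtype); the function name and surrounding code are illustrative, not the exact webui implementation:

```python
import torch

def sdp_attention_upcast(q, k, v):
    # Illustrative only: with "Upcast cross attention layer to float32" enabled,
    # q and k were already being cast to float32; the fix casts v as well so all
    # three inputs share a dtype before the scaled dot product attention call.
    q, k, v = q.float(), k.float(), v.float()
    return torch.nn.functional.scaled_dot_product_attention(
        q, k, v, dropout_p=0.0, is_causal=False
    )
```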
2023-06-04 | Merge pull request #10990 from vkage/sd_hijack_optimizations_bugfix | AUTOMATIC1111 | -1/+1
torch.cuda.is_available() check for SdOptimizationXformers
2023-06-04 | fix the broken line for #10990 | AUTOMATIC | -1/+1
2023-06-03 | torch.cuda.is_available() check for SdOptimizationXformers | Vivek K. Vasishtha | -1/+1
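A hedged sketch of what such an availability check can look like; the class below is simplified and omits the rest of the real SdOptimizationXformers logic (such as the command-line flags that enable xformers):

```python
import torch

class SdOptimizationXformers:
    # Simplified sketch: only report the xformers optimization as available
    # when a CUDA device is actually present.
    def is_available(self):
        return torch.cuda.is_available()
```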
2023-06-01 | revert default cross attention optimization to Doggettx | AUTOMATIC | -3/+3
make --disable-opt-split-attention command line option work again
2023-05-31 | rename print_error to report, use it together with package name | AUTOMATIC | -2/+1
2023-05-29Add & use modules.errors.print_error where currently printing exception info ↵Aarni Koskela-4/+2
by hand
2023-05-21 | Add a couple of `from __future__ import annotations` imports for Py3.9 compat | Aarni Koskela | -0/+1
2023-05-19 | Apply suggestions from code review | AUTOMATIC1111 | -38/+28
Co-authored-by: Aarni Koskela <akx@iki.fi>
2023-05-19 | fix linter issues | AUTOMATIC | -1/+1
2023-05-18 | make it possible for scripts to add cross attention optimizations | AUTOMATIC | -3/+132
add UI selection for cross attention optimization
2023-05-11 | Autofix Ruff W (not W605) (mostly whitespace) | Aarni Koskela | -16/+16
2023-05-10 | ruff auto fixes | AUTOMATIC | -7/+7
2023-05-10 | autofixes from ruff | AUTOMATIC | -1/+0
2023-05-08 | Fix for Unet NaNs | brkirch | -0/+3
2023-03-24 | Update sd_hijack_optimizations.py | FNSpd | -1/+1
2023-03-21 | Update sd_hijack_optimizations.py | FNSpd | -1/+1
2023-03-10 | sdp_attnblock_forward hijack | Pam | -0/+24
2023-03-10 | argument to disable memory efficient attention for sdp | Pam | -0/+4
2023-03-07 | scaled dot product attention | Pam | -0/+42
2023-01-25 | Add UI setting for upcasting attention to float32 | brkirch | -60/+99
Adds "Upcast cross attention layer to float32" option in Stable Diffusion settings. This allows for generating images using SD 2.1 models without --no-half or xFormers. In order to make upcasting cross attention layer optimizations possible it is necessary to indent several sections of code in sd_hijack_optimizations.py so that a context manager can be used to disable autocast. Also, even though Stable Diffusion (and Diffusers) only upcast q and k, unfortunately my findings were that most of the cross attention layer optimizations could not function unless v is upcast also.
2023-01-23 | better support for xformers flash attention on older versions of torch | AUTOMATIC | -24/+18
2023-01-21 | add --xformers-flash-attention option & impl | Takuma Mori | -2/+24
2023-01-21 | extra networks UI | AUTOMATIC | -5/+5
rework of hypernets: rather than via settings, hypernets are added directly to prompt as <hypernet:name:weight>
2023-01-06 | Added license | brkirch | -0/+1
2023-01-06 | Change sub-quad chunk threshold to use percentage | brkirch | -9/+9
2023-01-06 | Add Birch-san's sub-quadratic attention implementation | brkirch | -25/+99
2022-12-20 | Use other MPS optimization for large q.shape[0] * q.shape[1] | brkirch | -4/+6
Check if q.shape[0] * q.shape[1] is 2**18 or larger and use the lower-memory MPS optimization if it is. This should prevent most crashes that were occurring at certain resolutions (e.g. 1024x1024, 2048x512, 512x2048). Also included is a change that checks slice_size and prevents it from being divisible by 4096, since that also results in a crash; otherwise a crash can occur at 1024x512 or 512x1024 resolution.
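A small hypothetical helper illustrating the two checks described above; the function name and return values are illustrative, not the actual code in sd_hijack_optimizations.py:

```python
def choose_mps_attention_path(q, slice_size):
    # Illustrative: prefer the lower-memory MPS optimization once
    # q.shape[0] * q.shape[1] reaches 2**18, and nudge slice_size off
    # multiples of 4096, which were observed to crash (e.g. at 1024x512).
    use_low_memory_path = q.shape[0] * q.shape[1] >= 2 ** 18
    if slice_size % 4096 == 0:
        slice_size -= 1
    return use_low_memory_path, slice_size
```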
2022-12-10 | cleanup some unneeded imports for hijack files | AUTOMATIC | -3/+0
2022-12-10 | do not replace entire unet for the resolution hack | AUTOMATIC | -28/+0
2022-11-23 | Patch UNet Forward to support resolutions that are not multiples of 64 | Billy Cao | -0/+31
Also modified the UI so it no longer steps in increments of 64.
2022-10-19 | Remove wrong self reference in CUDA support for invokeai | Cheka | -1/+1
2022-10-18 | Update sd_hijack_optimizations.py | C43H66N12O12S2 | -0/+3
2022-10-18 | readd xformers attnblock | C43H66N12O12S2 | -0/+15
2022-10-18 | delete xformers attnblock | C43H66N12O12S2 | -12/+0
2022-10-11 | Use apply_hypernetwork function | brkirch | -10/+4
2022-10-11 | Add InvokeAI and lstein to credits, add back CUDA support | brkirch | -0/+13
2022-10-11 | Add check for psutil | brkirch | -4/+15
2022-10-11 | Add cross-attention optimization from InvokeAI | brkirch | -0/+79
* Add cross-attention optimization from InvokeAI (~30% speed improvement on MPS)
* Add command line option for it
* Make it default when CUDA is unavailable
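A hypothetical sketch of the default-selection logic implied by the last point; the optimization names follow ones mentioned elsewhere in this log, and the function is illustrative rather than the webui's actual selection code:

```python
import torch

def pick_default_cross_attention_optimization():
    # Illustrative: fall back to the InvokeAI optimization (fast on MPS)
    # whenever CUDA is unavailable; otherwise keep the CUDA-oriented default.
    return "InvokeAI" if not torch.cuda.is_available() else "Doggettx"
```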
2022-10-11 | rename hypernetwork dir to hypernetworks to prevent clash with an old filename that people who use zip instead of git clone will have | AUTOMATIC | -1/+1
2022-10-11 | fixes related to merge | AUTOMATIC | -1/+2
2022-10-11 | replace duplicate code with a function | AUTOMATIC | -29/+15
2022-10-10 | remove functorch | C43H66N12O12S2 | -2/+0
2022-10-09 | Fix VRAM Issue by only loading in hypernetwork when selected in settings | Fampai | -3/+3
2022-10-08 | make --force-enable-xformers work without needing --xformers | AUTOMATIC | -1/+1
2022-10-08 | add fallback for xformers_attnblock_forward | AUTOMATIC | -1/+4
2022-10-08 | simplify xformers options: --xformers to enable and that's it | AUTOMATIC | -7/+13
2022-10-08 | emergency fix for xformers (continue + shared) | AUTOMATIC | -8/+8