path: root/modules/sd_hijack_optimizations.py
Commit message | Author | Date | Files | Lines
* Add a couple `from __future__ import annotations`es for Py3.9 compat | Aarni Koskela | 2023-05-20 | 1 | -0/+1
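For context, that import lets the newer annotation syntax parse on Python 3.9; a minimal illustration (not taken from the file itself):

    from __future__ import annotations  # allows `int | None` and `list[int]` annotations to parse on Python 3.9

    def example(limit: int | None = None) -> list[int]:
        return list(range(limit or 0))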
* Apply suggestions from code review | AUTOMATIC1111 | 2023-05-19 | 1 | -38/+28
    Co-authored-by: Aarni Koskela <akx@iki.fi>
* fix linter issues | AUTOMATIC | 2023-05-18 | 1 | -1/+1
* make it possible for scripts to add cross attention optimizations | AUTOMATIC | 2023-05-18 | 1 | -3/+132
    add UI selection for cross attention optimization
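In practice a script registers its optimization through a callback; the sketch below uses the SdOptimization base class and on_list_optimizers callback as I understand them from this commit, but treat the exact names and signatures as assumptions and check the current source:

    from modules import script_callbacks, sd_hijack_optimizations

    class MyOptimization(sd_hijack_optimizations.SdOptimization):
        name = "my-optimization"  # label shown in the cross attention optimization dropdown

        def is_available(self):
            return True

        def apply(self):
            pass  # replace the attention forward functions here

    def list_optimizers(res):
        res.append(MyOptimization())

    script_callbacks.on_list_optimizers(list_optimizers)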
* Autofix Ruff W (not W605) (mostly whitespace) | Aarni Koskela | 2023-05-11 | 1 | -16/+16
* ruff auto fixes | AUTOMATIC | 2023-05-10 | 1 | -7/+7
* autofixes from ruff | AUTOMATIC | 2023-05-10 | 1 | -1/+0
* Fix for Unet NaNs | brkirch | 2023-05-08 | 1 | -0/+3
* Update sd_hijack_optimizations.py | FNSpd | 2023-03-24 | 1 | -1/+1
* Update sd_hijack_optimizations.py | FNSpd | 2023-03-21 | 1 | -1/+1
* sdp_attnblock_forward hijack | Pam | 2023-03-10 | 1 | -0/+24
* argument to disable memory efficient for sdp | Pam | 2023-03-10 | 1 | -0/+4
* scaled dot product attention | Pam | 2023-03-06 | 1 | -0/+42
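The three commits above route attention through PyTorch's built-in scaled dot product kernel; the core calls they rely on look roughly like this (a sketch, not the file's exact code):

    import torch
    import torch.nn.functional as F

    def sdp_attention_sketch(q, k, v, disable_memory_efficient=False):
        # F.scaled_dot_product_attention picks a fused kernel (flash, memory-efficient
        # or plain math) automatically; the "disable memory efficient" argument
        # mentioned above narrows that choice.
        if disable_memory_efficient:
            with torch.backends.cuda.sdp_kernel(enable_mem_efficient=False):
                return F.scaled_dot_product_attention(q, k, v)
        return F.scaled_dot_product_attention(q, k, v)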
* Add UI setting for upcasting attention to float32 | brkirch | 2023-01-25 | 1 | -60/+99
    Adds an "Upcast cross attention layer to float32" option in Stable Diffusion settings. This allows generating
    images with SD 2.1 models without --no-half or xFormers. To make upcast variants of the cross attention layer
    optimizations possible, several sections of code in sd_hijack_optimizations.py are indented so that a context
    manager can be used to disable autocast. Also, even though Stable Diffusion (and Diffusers) only upcast q and k,
    in practice most of the cross attention layer optimizations could not function unless v is upcast as well.
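As a minimal illustration of the idea described above (a sketch assuming a CUDA device, not the repository's exact code), the upcast path casts q, k and v to float32 and runs the attention math with autocast disabled:

    import torch

    def upcast_attention_sketch(q, k, v):
        # Disable autocast so the float32 inputs are not silently cast back to half precision.
        with torch.autocast(device_type="cuda", enabled=False):
            q, k, v = q.float(), k.float(), v.float()
            scale = q.shape[-1] ** -0.5
            weights = torch.softmax(q @ k.transpose(-2, -1) * scale, dim=-1)
            return weights @ v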
* better support for xformers flash attention on older versions of torch | AUTOMATIC | 2023-01-23 | 1 | -24/+18
* add --xformers-flash-attention option & impl | Takuma Mori | 2023-01-21 | 1 | -2/+24
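Roughly, the option asks xformers for its Flash Attention operator explicitly instead of letting the library auto-select a kernel; a sketch of that call (the webui's actual integration differs in details such as head reshaping):

    import xformers.ops

    def xformers_attention_sketch(q, k, v, use_flash_attention=False):
        # With the option enabled, request the Flash Attention operator explicitly;
        # otherwise let xformers pick the best available kernel.
        op = xformers.ops.MemoryEfficientAttentionFlashAttentionOp if use_flash_attention else None
        return xformers.ops.memory_efficient_attention(q, k, v, attn_bias=None, op=op)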
* extra networks UI | AUTOMATIC | 2023-01-21 | 1 | -5/+5
    rework of hypernets: rather than via settings, hypernets are added directly to the prompt as <hypernet:name:weight>
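For example, after this rework a hypernetwork is enabled from the prompt text itself (the hypernetwork name and weight below are placeholders):

    prompt = "a portrait photo of an astronaut <hypernet:anime_style:0.65>"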
* Added license | brkirch | 2023-01-06 | 1 | -0/+1
* Change sub-quad chunk threshold to use percentage | brkirch | 2023-01-06 | 1 | -9/+9
* Add Birch-san's sub-quadratic attention implementation | brkirch | 2023-01-06 | 1 | -25/+99
* Use other MPS optimization for large q.shape[0] * q.shape[1] | brkirch | 2022-12-21 | 1 | -4/+6
    Check if q.shape[0] * q.shape[1] is 2**18 or larger and use the lower memory usage MPS optimization if it is.
    This should prevent most crashes that were occurring at certain resolutions (e.g. 1024x1024, 2048x512, 512x2048).
    Also included is a change to check slice_size and prevent it from being divisible by 4096, which also results in
    a crash. Otherwise a crash can occur at 1024x512 or 512x1024 resolution.
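A minimal sketch of the dispatch logic that description implies (the function and variable names are illustrative, not the file's own):

    def choose_mps_attention_path(q, slice_size):
        # Large attention shapes (q.shape[0] * q.shape[1] >= 2**18) take the
        # lower-memory MPS path; a slice_size divisible by 4096 is nudged down
        # because that case was observed to crash.
        use_low_memory_path = q.shape[0] * q.shape[1] >= 2 ** 18
        if slice_size % 4096 == 0:
            slice_size -= 1
        return use_low_memory_path, slice_size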
* cleanup some unneeded imports for hijack files | AUTOMATIC | 2022-12-10 | 1 | -3/+0
* do not replace entire unet for the resolution hack | AUTOMATIC | 2022-12-10 | 1 | -28/+0
* Patch UNet Forward to support resolutions that are not multiples of 64 | Billy Cao | 2022-11-23 | 1 | -0/+31
    Also modified the UI to no longer step in increments of 64
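Conceptually the patch amounts to padding up to the next multiple of 64 before the forward pass and cropping the result back afterwards; the sketch below illustrates that idea on an image-space tensor and is not the commit's actual code, which operates on the UNet's internal tensors:

    import torch.nn.functional as F

    def pad_to_multiple_of_64(x):
        # x: tensor of shape (batch, channels, height, width)
        h, w = x.shape[-2:]
        pad_h = (64 - h % 64) % 64
        pad_w = (64 - w % 64) % 64
        return F.pad(x, (0, pad_w, 0, pad_h)), (h, w)

    def crop_back(y, original_hw):
        h, w = original_hw
        return y[..., :h, :w]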
* Remove wrong self reference in CUDA support for invokeai | Cheka | 2022-10-19 | 1 | -1/+1
* Update sd_hijack_optimizations.py | C43H66N12O12S2 | 2022-10-18 | 1 | -0/+3
* readd xformers attnblock | C43H66N12O12S2 | 2022-10-18 | 1 | -0/+15
* delete xformers attnblock | C43H66N12O12S2 | 2022-10-18 | 1 | -12/+0
* Use apply_hypernetwork function | brkirch | 2022-10-11 | 1 | -10/+4
* Add InvokeAI and lstein to credits, add back CUDA support | brkirch | 2022-10-11 | 1 | -0/+13
* Add check for psutil | brkirch | 2022-10-11 | 1 | -4/+15
* Add cross-attention optimization from InvokeAI | brkirch | 2022-10-11 | 1 | -0/+79
    * Add cross-attention optimization from InvokeAI (~30% speed improvement on MPS)
    * Add command line option for it
    * Make it default when CUDA is unavailable
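A hedged sketch of the selection behaviour the last bullet describes (the function, flag and returned labels are stand-ins, not the webui's actual option names):

    import torch

    def pick_default_cross_attention(force_invokeai=False):
        # Prefer the InvokeAI optimization when CUDA is unavailable (e.g. on Apple MPS),
        # or when the user explicitly asks for it on the command line.
        if force_invokeai or not torch.cuda.is_available():
            return "invokeai"
        return "split-attention"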
* rename hypernetwork dir to hypernetworks to prevent clash with an old filename that people who use zip instead of git clone will have | AUTOMATIC | 2022-10-11 | 1 | -1/+1
* fixes related to merge | AUTOMATIC | 2022-10-11 | 1 | -1/+2
* replace duplicate code with a function | AUTOMATIC | 2022-10-11 | 1 | -29/+15
* remove functorch | C43H66N12O12S2 | 2022-10-10 | 1 | -2/+0
* Fix VRAM Issue by only loading in hypernetwork when selected in settings | Fampai | 2022-10-09 | 1 | -3/+3
* make --force-enable-xformers work without needing --xformers | AUTOMATIC | 2022-10-08 | 1 | -1/+1
* add fallback for xformers_attnblock_forward | AUTOMATIC | 2022-10-08 | 1 | -1/+4
* simplify xformers options: --xformers to enable and that's it | AUTOMATIC | 2022-10-08 | 1 | -7/+13
* emergency fix for xformers (continue + shared) | AUTOMATIC | 2022-10-08 | 1 | -8/+8
* Merge pull request #1851 from C43H66N12O12S2/flash | AUTOMATIC1111 | 2022-10-08 | 1 | -1/+37
    xformers attention
| * update sd_hijack_opt to respect new env variables | C43H66N12O12S2 | 2022-10-08 | 1 | -3/+8
| * Update sd_hijack_optimizations.py | C43H66N12O12S2 | 2022-10-08 | 1 | -1/+1
| * add xformers attnblock and hypernetwork support | C43H66N12O12S2 | 2022-10-08 | 1 | -2/+18
| * switch to the proper way of calling xformers | C43H66N12O12S2 | 2022-10-08 | 1 | -25/+3
| * add xformers attention | C43H66N12O12S2 | 2022-10-07 | 1 | -1/+38
* | Add hypernetwork support to split cross attention v1 | brkirch | 2022-10-08 | 1 | -4/+14
    * Add hypernetwork support to split_cross_attention_forward_v1
    * Fix device check in esrgan_model.py to use devices.device_esrgan instead of shared.device
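As a rough sketch of what hypernetwork support inside a cross attention forward amounts to (identifiers are illustrative rather than copied from the file): the hypernetwork supplies a pair of small modules, keyed by context width, that transform the context before the k and v projections.

    def apply_hypernetwork_sketch(hypernetwork, context):
        # When the hypernetwork has a layer pair matching this context width, use it
        # to produce separate contexts for the k and v projections; otherwise both
        # projections see the plain context.
        layers = hypernetwork.layers.get(context.shape[2], None) if hypernetwork is not None else None
        if layers is None:
            return context, context
        return layers[0](context), layers[1](context)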
* | added support for hypernetworks (???) | AUTOMATIC | 2022-10-07 | 1 | -2/+15
* Merge branch 'master' into stable | Jairo Correa | 2022-10-02 | 1 | -8/+0