path: root/modules/devices.py
Commit message (Author, Age, Files, Lines)
...
* | clarify the option to disable NaN check. (AUTOMATIC, 2023-01-27, 1 file, -0/+2)
| |
* | remove the need to place configs near models (AUTOMATIC, 2023-01-27, 1 file, -4/+8)
|/
* Add UI setting for upcasting attention to float32 (brkirch, 2023-01-25, 1 file, -1/+5)
|     Adds "Upcast cross attention layer to float32" option in Stable Diffusion settings. This allows for generating images using SD 2.1 models without --no-half or xFormers.
|
|     In order to make upcasting cross attention layer optimizations possible it is necessary to indent several sections of code in sd_hijack_optimizations.py so that a context manager can be used to disable autocast. Also, even though Stable Diffusion (and Diffusers) only upcast q and k, unfortunately my findings were that most of the cross attention layer optimizations could not function unless v is upcast also.
* Add option for float32 sampling with float16 UNet (brkirch, 2023-01-25, 1 file, -0/+2)
|     This also handles type casting so that ROCm and MPS torch devices work correctly without --no-half. One cast is required for deepbooru in deepbooru_model.py, some explicit casting is required for img2img and inpainting. depth_model can't be converted to float16 or it won't work correctly on some systems (it's known to have issues on MPS) so in sd_models.py model.depth_model is removed for model.half().
* Merge pull request #6922 from brkirch/cumsum-fix (AUTOMATIC1111, 2023-01-19, 1 file, -4/+7)
|\
| |     Improve cumsum fix for MPS
| * Fix cumsum for MPS in newer torch (brkirch, 2023-01-18, 1 file, -4/+7)
| |     The prior fix assumed that testing int16 was enough to determine if a fix is needed, but a recent fix for cumsum has int16 working but not bool.
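The check described in the cumsum commit above can be sketched as a small probe that tests cumsum separately for int16 and bool on a given device. This is an illustrative sketch, not the code from the commit: the helper name `probe_cumsum` and its return shape are assumptions, and only standard PyTorch calls are used.

```python
import torch


def probe_cumsum(device: str = "cpu") -> dict:
    """Report, per dtype, whether cumsum on `device` gives the expected result.

    `probe_cumsum` is a hypothetical helper; pass "mps" on Apple hardware.
    """
    expected = torch.tensor([1, 2], dtype=torch.int64)
    results = {}
    for name, dtype in (("int16", torch.int16), ("bool", torch.bool)):
        ones = torch.ones(2, dtype=dtype, device=device)
        # Move the result back to the CPU so the comparison itself is reliable.
        got = ones.cumsum(0).cpu().to(torch.int64)
        results[name] = bool(torch.equal(got, expected))
    return results
```

Testing each dtype on its own matters here because, as the commit message notes, one dtype can work while the other is still broken.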
* | disable the new NaN check for the CI (AUTOMATIC, 2023-01-17, 1 file, -0/+3)
| |
* | Add a check and explanation for tensor with all NaNs. (AUTOMATIC, 2023-01-16, 1 file, -0/+28)
|/
* Add support for PyTorch nightly and local builds (brkirch, 2023-01-06, 1 file, -5/+23)
|
* Add numpy fix for MPS on PyTorch 1.12.1 (brkirch, 2022-12-17, 1 file, -0/+9)
|     When saving training results with torch.save(), an exception is thrown:
|     "RuntimeError: Can't call numpy() on Tensor that requires grad. Use tensor.detach().numpy() instead."
|
|     So for MPS, check if Tensor.requires_grad and detach() if necessary.
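The detach-before-numpy idea from the commit above can be sketched as a standalone helper. This is a minimal sketch under assumptions: the name `to_numpy` is hypothetical, and the helper wraps the call instead of reproducing how the commit applies the fix for MPS.

```python
import torch


def to_numpy(tensor: torch.Tensor):
    """Convert a tensor to a NumPy array, detaching it first if it tracks gradients.

    `to_numpy` is a hypothetical helper name used only for illustration.
    """
    if tensor.requires_grad:
        # numpy() refuses tensors that require grad, so drop the autograd link.
        tensor = tensor.detach()
    # NumPy arrays live in host memory, so copy from the device if needed.
    return tensor.cpu().numpy()
```

Calling `to_numpy(torch.ones(3, requires_grad=True))` succeeds, whereas calling `.numpy()` on that tensor directly raises the RuntimeError quoted in the commit message.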
* add built-in extension system (AUTOMATIC, 2022-12-03, 1 file, -1/+10)
|     add support for adding upscalers in extensions
|     move LDSR, ScuNET and SwinIR to built-in extensions
* add comment for #4407 and remove seemingly unnecessary cudnn.enabled (AUTOMATIC, 2022-12-03, 1 file, -1/+3)
|
* fix #4407 breaking UI entirely for card other than ones related to the PR (AUTOMATIC, 2022-12-03, 1 file, -4/+2)
|
* Merge pull request #4407 from yoinked-h/patch-1 (AUTOMATIC1111, 2022-12-03, 1 file, -0/+7)
|\
| |     Fix issue with 16xx cards
| * actual better fix (pepe10-gpu, 2022-11-08, 1 file, -5/+2)
| |     thanks C43H66N12O12S2
| * terrible hack (pepe10-gpu, 2022-11-08, 1 file, -2/+9)
| |
| * 16xx card fix (pepe10-gpu, 2022-11-07, 1 file, -0/+3)
| |     cudnn
* | Rework MPS randn fix, add randn_like fix (brkirch, 2022-11-30, 1 file, -12/+3)
| |     torch.manual_seed() already sets a CPU generator, so there is no reason to create a CPU generator manually. torch.randn_like also needs a MPS fix for k-diffusion, but a torch hijack with randn_like already exists so it can also be used for that.
* | Merge pull request #4918 from brkirch/pytorch-fixes (AUTOMATIC1111, 2022-11-27, 1 file, -7/+24)
|\ \
| | |     Fixes for PyTorch 1.12.1 when using MPS
| * | Add fixes for PyTorch 1.12.1 (brkirch, 2022-11-21, 1 file, -1/+27)
| | |     Fix typo "MasOS" -> "macOS"
| | |
| | |     If MPS is available and PyTorch is an earlier version than 1.13:
| | |     * Monkey patch torch.Tensor.to to ensure all tensors sent to MPS are contiguous
| | |     * Monkey patch torch.nn.functional.layer_norm to ensure input tensor is contiguous (required for this program to work with MPS on unmodified PyTorch 1.12.1)
| * | Revert "MPS Upscalers Fix" (brkirch, 2022-11-17, 1 file, -9/+0)
| | |     This reverts commit 768b95394a8500da639b947508f78296524f1836.
* | | eliminate duplicated code from #5095 (AUTOMATIC, 2022-11-27, 1 file, -19/+11)
| | |
* | | torch.cuda.empty_cache() defaults to cuda:0 device unless explicitly set otherwise first. Updating torch_gc() to use the device set by --device-id if specified to avoid OOM edge cases on multi-GPU systems. (Matthew McGoogan, 2022-11-26, 1 file, -2/+12)
|/ /
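The device-aware cache cleanup described in the commit above might look roughly like this. It is a sketch under stated assumptions: `cuda_device_string` and the `device_id` parameter are hypothetical stand-ins for however the program reads --device-id, and the function is a no-op on machines without CUDA.

```python
from typing import Optional

import torch


def cuda_device_string(device_id: Optional[int] = None) -> str:
    """Build a device string; `device_id` stands in for the --device-id value."""
    return "cuda" if device_id is None else f"cuda:{device_id}"


def torch_gc(device_id: Optional[int] = None) -> None:
    """Release cached CUDA memory on the selected device, if CUDA is present."""
    if not torch.cuda.is_available():
        return
    # Select the device first so the cache is emptied there, not on cuda:0.
    with torch.cuda.device(torch.device(cuda_device_string(device_id))):
        torch.cuda.empty_cache()
        torch.cuda.ipc_collect()
```

Wrapping the calls in the `torch.cuda.device` context manager restores the previously selected device afterwards, so the cleanup does not change which GPU the rest of the program uses.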
* | change formatting to match the main program in devices.py (AUTOMATIC, 2022-11-12, 1 file, -5/+16)
| |
* | Update devices.py (源文雨, 2022-11-12, 1 file, -1/+1)
| |
* | Fix wrong mps selection below MasOS 12.3 (源文雨, 2022-11-12, 1 file, -3/+10)
|/
* MPS Upscalers Fix (brkirch, 2022-10-25, 1 file, -0/+4)
|     Get ESRGAN, SCUNet, and SwinIR working correctly on MPS by ensuring memory is contiguous for tensor views before sending to MPS device.
* Remove BSRGAN from --use-cpu, add SwinIR (brkirch, 2022-10-25, 1 file, -1/+1)
|
* remove parsing command line from devices.py (AUTOMATIC, 2022-10-22, 1 file, -9/+5)
|
* implement CUDA device selection by ID (Extraltodeus, 2022-10-21, 1 file, -3/+18)
|
* Add 'interrogate' and 'all' choices to --use-cpu (brkirch, 2022-10-14, 1 file, -1/+1)
|     * Add 'interrogate' and 'all' choices to --use-cpu
|     * Change type for --use-cpu argument to str.lower, so that choices are case insensitive
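The case-insensitive choices trick from the commit above works because argparse applies `type` to each value before checking it against `choices`. A minimal sketch, assuming a multi-value option and an illustrative choice list (the exact list and `nargs` setting in the program are not shown in this log):

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a case-insensitive --use-cpu option (illustrative)."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--use-cpu",
        nargs="+",
        default=[],
        # str.lower runs first, so "GFPGAN" becomes "gfpgan" before validation.
        type=str.lower,
        # Hypothetical choice list, assembled from names mentioned in this log.
        choices=["all", "sd", "interrogate", "gfpgan", "esrgan", "codeformer"],
    )
    return parser
```

With this parser, `--use-cpu GFPGAN Interrogate` is accepted and stored as lowercase values, while an unknown name is still rejected by the usual choices check.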
* --no-half-vae (AUTOMATIC, 2022-10-10, 1 file, -1/+5)
|
* Merge branch 'master' into cpu-cmdline-opt (brkirch, 2022-10-04, 1 file, -0/+10)
|\
| * send all three of GFPGAN's and codeformer's models to CPU memory instead of just one for #1283 (AUTOMATIC, 2022-10-04, 1 file, -0/+10)
| |
* | Add BSRGAN to --add-cpu (brkirch, 2022-10-04, 1 file, -1/+1)
| |
* | Add --use-cpu command line option (brkirch, 2022-10-04, 1 file, -3/+2)
| |     Remove MPS detection to use CPU for GFPGAN / CodeFormer and add a --use-cpu command line option.
* | Merge branch 'master' into master (brkirch, 2022-10-04, 1 file, -2/+1)
|\|
| * initial support for training textual inversion (AUTOMATIC, 2022-10-02, 1 file, -2/+1)
| |
* | When device is MPS, use CPU for GFPGAN instead (brkirch, 2022-10-01, 1 file, -1/+1)
|/
|     GFPGAN will not work if the device is MPS, so default to CPU instead.
* first attempt to produce crrect seeds in batch (AUTOMATIC, 2022-09-13, 1 file, -0/+10)
|
* changes for #294 (AUTOMATIC, 2022-09-12, 1 file, -0/+17)
|
* Allow TF32 in CUDA for increased performance #279 (AUTOMATIC, 2022-09-12, 1 file, -0/+11)
|
* add half() supporrt for CLIP interrogation (AUTOMATIC, 2022-09-11, 1 file, -0/+6)
|
* CLIP interrogator (AUTOMATIC, 2022-09-11, 1 file, -6/+10)
|
* Modular device management (Abdullah Barhoum, 2022-09-11, 1 file, -0/+12)