
A1111 GTX1650 Optimization guide (other Nvidia cards too)

This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/boudywho on 2023-12-12 18:39:17.


I'll explain how to get the fastest generations on both Windows and Linux, and show the command line arguments and tweaks that made generations faster for me. (This is a noob guide.)

(It's my first time posting something like this, but I wanted to help some lost users, as I was so lost at one point myself.)

  • Laptop specs: GTX 1650, Intel Core i5 10th Gen, 16 GB DDR4 RAM
  • Results: 1.02 it/s on Windows (about 30 seconds for a 512x512 image with 25 steps) and 1.22 it/s on Linux (about 24 seconds for a 512x512 image with 25 steps)
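A quick sanity check on those numbers: it/s only counts sampling steps, so steps divided by it/s gives a lower bound on the time per image. The small helper below is hypothetical (it is not part of A1111) and just does that division with awk. The 30 and 24 seconds quoted above come out a few seconds higher than this bound, which would be consistent with extra per-image overhead outside the sampling loop; that part is an assumption, not something measured here.

```shell
# Estimate pure sampling time in seconds from step count and speed (it/s).
# sampling_seconds is a hypothetical helper name, not an A1111 command.
sampling_seconds() {
    awk -v steps="$1" -v its="$2" 'BEGIN { printf "%.1f\n", steps / its }'
}

sampling_seconds 25 1.02   # Windows speed from the post, prints 24.5
sampling_seconds 25 1.22   # Linux speed from the post, prints 20.5
```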

I won't be explaining how to install A1111, as there is already a well-explained guide and I definitely can't make a better one.

  • I started by playing with the command line arguments. The best ones I found for the GTX 1650 are below (don't retype "set COMMANDLINE_ARGS=", it's already there):
    
set COMMANDLINE_ARGS=--medvram --xformers --precision full --no-half --upcast-sampling


  

RTX users with 8 GB or more of VRAM only need --xformers. You can test other arguments too, which can be found here.

Then I added this line right below it. It makes PyTorch free unused cached VRAM more aggressively and reduces memory fragmentation (it helped me get fewer CUDA out-of-memory errors):

    
set PYTORCH_CUDA_ALLOC_CONF=garbage_collection_threshold:0.9,max_split_size_mb:512


  

You can add those lines to webui-user.bat, which is in the "stable-diffusion-webui" folder.

  • Then I wondered if the Nvidia drivers played a role in generation speed, so I tried both the latest driver (546.17 at the time of writing) and 531.61. They made no difference on my GTX 1650, so I stayed on the latest. (This may differ depending on your card; try both versions and see which is best.)
  • Then I installed the "Tiled Diffusion" extension, which gave me even faster generations and fewer CUDA memory errors!

-To install it, start A1111 first, then open the "Extensions" tab -> click "Available" -> search for "TiledDiffusion with Tiled VAE" -> click "Install". Then go to the "Installed" tab and press "Apply and restart UI".

As simple as that. After restarting, you will find 2 new options in your UI; we will only be using "Tiled VAE". Enable it, and everything should already be set to sensible defaults. If you still get CUDA memory errors, decrease both sliders slightly until the errors stop. Once you are happy with your settings, go to the A1111 "Settings" tab, scroll down to the "Defaults" section, and update your defaults with the new Tiled VAE settings so you don't have to enable it every time you start A1111.

  • Now to some Windows tweaks

-First, I went to Settings > System > Display > Graphics > Default graphics settings and disabled "Hardware-accelerated GPU scheduling". This gave me slightly better speeds, but you can test with it on and off.

-Close all background apps (obviously). You can find hidden ones in the system tray.

-I debloated my Nvidia drivers, which you can do with NVCleanstall (you can skip this step if it's complicated).

-Lastly, disable hardware acceleration in your browser. In Firefox (other browsers have a similar option): Settings > scroll down to "Performance" > untick "Use recommended performance settings", then untick "Use hardware acceleration when available", then restart your browser.

Now after all these tweaks, you should be getting around 1 it/s (GTX 1650)

  • If you wanna go even further, you can install Linux. I used Pop!_OS. (You could try Mint, Ubuntu, your choice.)

Before you install A1111 on Linux, make sure the Nvidia drivers are installed (Pop!_OS installs them automatically; just make sure you have updated everything in the Pop!_Shop) and run these commands first:

-This makes sure you are on the latest updates: sudo apt update, then sudo apt upgrade. It will take some time depending on your internet speed.

-Then we need to install TCMalloc, which helps reduce CPU usage and gives faster speeds. Just run this in the terminal:

    
sudo apt install libgoogle-perftools-dev


  

-Now you are good to go. Install A1111 using the same guide I mentioned above.
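If you want to double-check the prerequisites before installing, here is a rough read-only sketch. It assumes nvidia-smi (which ships with the Nvidia driver) and ldconfig are on your PATH; the two function names are made up for this example. Nothing gets installed or changed.

```shell
# Read-only sanity checks; function names are hypothetical, chosen for this example.

check_nvidia_driver() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found: the Nvidia driver does not seem to be installed"
        return 1
    fi
    # Print GPU name, driver version and total VRAM as CSV.
    nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv
}

check_tcmalloc() {
    # ldconfig -p lists the shared libraries known to the dynamic linker.
    if ldconfig -p 2>/dev/null | grep -q 'libtcmalloc'; then
        echo "TCMalloc found"
    else
        echo "TCMalloc not found: try 'sudo apt install libgoogle-perftools-dev'"
        return 1
    fi
}

check_nvidia_driver || true
check_tcmalloc || true
```

While generating, `nvidia-smi -l 1` in a second terminal refreshes every second and shows how close you are to the card's VRAM limit.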

  • To launch A1111, open a terminal in the "stable-diffusion-webui" folder by right-clicking inside it and choosing "Open in Terminal".
  • Here is the command to launch it, with the same command line arguments used on Windows:
    
./webui.sh --medvram --xformers --precision full --no-half --upcast-sampling


  
  • Then install Tiled VAE as I mentioned above.
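If you'd rather not type the arguments on every launch, they can live in webui-user.sh, the Linux counterpart of webui-user.bat in the same folder. A minimal sketch, assuming the stock webui-user.sh that comes with A1111 (where the COMMANDLINE_ARGS line is commented out by default). Carrying PYTORCH_CUDA_ALLOC_CONF over to Linux is an assumption; the guide only sets it on Windows, so test with and without it.

```shell
# webui-user.sh (in the stable-diffusion-webui folder); webui.sh reads it on launch.
# Same arguments as the launch command above, for a GTX 1650.
export COMMANDLINE_ARGS="--medvram --xformers --precision full --no-half --upcast-sampling"

# Assumption: same PyTorch allocator setting as in the Windows section, carried over to Linux.
export PYTORCH_CUDA_ALLOC_CONF="garbage_collection_threshold:0.9,max_split_size_mb:512"
```

With that saved, a plain ./webui.sh should pick up the same settings.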

If everything is done correctly, you should see speeds around 1.22 it/s (GTX 1650).

I hope this helped you. If you have any suggestions or questions, please let me know. I would love to hear from you, as I am still learning too :)
