(If you're here for the bonus question, I will give you some pointers: this was generated on my own gaming computer with an RTX 4060 with 16GB of VRAM, and the original size at which it was generated is 1024x1024.)
Let's be honest, AI is tough to get into for the average person. There are a lot of hardware and software requirements, lots of commands to run, and it can break at any step of the process. And that's just being the user, we're not even talking about doing research or
Originally, I got started with AI image gen because I wanted to try something with DeepSeek. I gave it the following:
- I am on Windows. My hardware is the aforementioned RTX, my GBs of RAM, etc. [letting it know the context so it doesn't make stuff up to fill in the gaps]
- I might already have some dependencies installed, I'm not sure. So make sure to check for them [avoiding useless steps that would just waste time]
- I want to start image generation on my computer but I have no idea where to start. What do I install? Write a guide that:
- Uses the command-line interface as much as possible [just faster to work with this, you can copy and paste]
- Orders steps from least effort to most effort [makes sure dependency check and the 'easy' stuff gets out of the way first]
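A minimal sketch of the "give it context" and "check for dependencies" parts of that prompt, assuming Python 3 is already installed. This is not the prompt itself: the function names (`tool_status`, `gpu_summary`, `build_context`) and the list of tools to look for are made up for illustration. It only collects facts you could paste at the top of the chat.

```python
import platform
import shutil
import subprocess
import sys


def tool_status(names=("git", "pip", "nvidia-smi", "docker")):
    """Map each tool name to its location on PATH, or None if it is not found."""
    return {name: shutil.which(name) for name in names}


def gpu_summary():
    """Return the GPU name and total VRAM as reported by nvidia-smi, or None."""
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
            capture_output=True, text=True, timeout=10, check=True,
        )
    except (subprocess.SubprocessError, OSError):
        # Driver problems, timeouts, etc.: report "unknown" instead of crashing.
        return None
    return result.stdout.strip() or None


def build_context():
    """Assemble a plain-text block to paste at the top of the prompt."""
    lines = [
        f"OS: {platform.system()} {platform.release()}",
        f"Python: {sys.version.split()[0]}",
        f"GPU: {gpu_summary() or 'unknown (nvidia-smi not available)'}",
    ]
    for name, path in tool_status().items():
        lines.append(f"{name}: {'installed at ' + path if path else 'not found'}")
    return "\n".join(lines)
```

Running `print(build_context())` and pasting the result saves the model from guessing what is already on the machine.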
And it worked. Any time I ran into an error in the cmd output, I gave the command and the entire log to DeepSeek, told it which step I was on, and it told me what to run, which would usually fix it. If it didn't, I sent it the new log again until it did. It may suggest installing older versions since its knowledge cutoff is August 2024, so you can make it look online for the latest info. Once I got ComfyUI working, for example, I pasted the startup log, which had a warning about an old Torch version. By making DeepSeek look online, we found that my ComfyUI version could use a newer one. With web search you don't spend hours opening dozens of tabs just to find which PyTorch CUDA version is compatible with your interface before installing it; the AI finds it for you.
It works, and it took less than an hour.
I have an interface for image gen that runs in a Python virtual environment so that all dependencies are contained, I have Docker Desktop to run my LLMs with, everything works great, and as far as I'm aware I'm on the latest versions of everything.
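For the Torch version question specifically, here is a small sketch of how you could collect the relevant facts yourself before asking. It assumes you run it inside the same environment your interface uses; `torch_report` is a hypothetical helper name, and it only reads attributes that PyTorch exposes (`torch.__version__`, `torch.version.cuda`, `torch.cuda.is_available()`).

```python
import importlib.util


def torch_report():
    """Collect the PyTorch facts that are worth pasting into a chat."""
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}
    import torch  # imported lazily so the script still runs without PyTorch

    return {
        "installed": True,
        "torch_version": torch.__version__,
        "cuda_build": torch.version.cuda,  # None on CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }
```

Pasting the output of `print(torch_report())` next to the startup log gives the model the exact versions to compare against whatever it finds online.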
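If "virtual environment" is new to you, this is roughly what gets created, sketched with Python's standard `venv` module. The helper name `make_env` and the folder name in the usage note are assumptions for illustration; real installers often do this step for you with `python -m venv`.

```python
import os
import venv
from pathlib import Path


def make_env(target, with_pip=True):
    """Create a virtual environment at `target` and return its Python path."""
    target = Path(target)
    # Everything installed into this environment stays inside `target`.
    venv.create(target, with_pip=with_pip)
    # Windows puts executables in Scripts\, other systems use bin/.
    scripts = target / ("Scripts" if os.name == "nt" else "bin")
    return scripts / ("python.exe" if os.name == "nt" else "python")
```

Calling `make_env("imagegen-env")` creates a self-contained folder; deleting that folder removes the environment without touching the system Python.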
-> It's not just for AI, it's just that LLMs are great with stuff that has a lot of content to train on, and tech is up there. You could use this same prompt for Linux, for example, and get more people away from Windows. For anything a bit techy that you need to troubleshoot or tinker with, AIs like DeepSeek can provide a guide and you just have to follow it.
When it came to installing image gen it was even able to give me some models to download (though they were a bit outdated) and help me troubleshoot. I didn't even know what a VAE was. It told me where to put the stuff I downloaded and what I needed to get started.
Then from there you learn by yourself. I downloaded better models and some LoRAs, and now I'm installing ComfyUI alongside AUTOMATIC1111's web UI (with the same instructions to DeepSeek) so I can try Flux, which gives amazing results. I know more about how AI works behind the scenes now that I've done this, and DeepSeek helped with that. It might help that I already had Python installed and knew what it is, as well as knowing what pip is and the other stuff I need to run commands. But if you have any questions, open a second DeepSeek window, send it the command, and ask "what's pip? what does this do? what's a virtual environment in python?"
Again, the point isn't so much image gen as it is getting you through the slog and working on stuff. You can use this to troubleshoot stuff like Photoshop, your VPS, your code, and who knows what else. It's easier with open source because the code and documentation are just there for the AI (with web lookup it can even look at the code directly if you send it the page), which I hope and feel will strengthen open source over proprietary software. The cURL project has already used AI to fix 100 bugs, including one they didn't even know was there.
Oh, I'll update with the answer to the bonus question in some hours haha.
While it can certainly help you do things, it can also lead you down some rabbit holes that can really bork your system if you blindly run whatever it gives you. It's a pretty mixed bag, but if you're specific enough it generally will not screw you over.
Exercise caution and don't hesitate to search up the commands (both on google and in another chat window) before executing.
Since this is a virtual environment I'm pretty safe. On VPSes I make sure to double-read the command before running anything and do diagnostics first, like with anything I copy from the internet. LLMs are very bad at asking questions before jumping to an answer, so give them as much context as you can up front, including your OS.
I also try to make sure I can undo whatever it asks me to do if need be, which is actually easier when you have a single window with every instruction in it rather than a dozen tabs, some of which I haven't opened in over 45 minutes while troubleshooting a minor problem inside a bigger problem lol.
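One simple way to keep things undoable, sketched in Python: copy a file before editing it on the model's advice. The `backup` and `restore` helpers and the `.bak` naming are illustrative assumptions, not something from the thread.

```python
import shutil
import time
from pathlib import Path


def backup(path):
    """Copy `path` to a timestamped sibling file and return the copy's path."""
    src = Path(path)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = src.with_name(f"{src.name}.{stamp}.bak")
    shutil.copy2(src, dest)  # copy2 also keeps timestamps where possible
    return dest


def restore(backup_path, original):
    """Overwrite `original` with the contents of the backup."""
    shutil.copy2(backup_path, original)
```

Call `backup("some_config.yaml")` before following an instruction that edits that file, and `restore(...)` with the returned path if the change makes things worse.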