Something to handle code, text and math.

  • monovergent@lemmy.ml · 13 hours ago

    16 GB VRAM GPU, models stored on SSD; the rest of the computer doesn’t have to be crazy. Intel Arc is the best bang for the buck at the moment. You can get LLMs running on 8 GB cards or even on the CPU, but IMO such small models are more novelties than workhorses. I personally use Debian, but you’ll be fine as long as your distro’s repo has drivers recent enough for your GPU.
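
    To make that concrete, here’s a minimal sketch of what running a local model looks like, assuming a quantized GGUF file on the SSD and the llama-cpp-python bindings built with a backend your GPU supports (Vulkan or SYCL for Arc). The path and settings are placeholders, not my exact setup:

      # Load a quantized model and offload as many layers as fit in VRAM.
      from llama_cpp import Llama

      llm = Llama(
          model_path="models/model-q4_k_m.gguf",  # hypothetical path on the SSD
          n_gpu_layers=-1,  # -1 offloads every layer; lower it if VRAM runs out
          n_ctx=4096,       # context window; bigger contexts also eat VRAM
      )

      out = llm("Write a bash one-liner that counts files per extension.", max_tokens=128)
      print(out["choices"][0]["text"])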

    For perspective, I’m using such a build to help with boilerplate code, single-use scripts that I don’t have the patience to trial-and-error (like ones that have to deal with directory structures and special characters), getting an idea of what’s what when decompiling and reverse engineering, brainstorming tip-of-the-tongue ideas, and upscaling images.
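
    For illustration, the “directory structures and special characters” chores are things like this hypothetical throwaway (root path and character set made up):

      # Walk a tree and rename files whose names contain awkward characters.
      import re
      from pathlib import Path

      ROOT = Path("/tmp/messy-dir")  # placeholder root directory

      # Rename deepest entries first so parents aren't moved out from under
      # their children mid-walk.
      for path in sorted(ROOT.rglob("*"), key=lambda p: -len(p.parts)):
          clean = re.sub(r"[^\w.\-]", "_", path.name)
          if clean != path.name:
              path.rename(path.with_name(clean))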

    • thingsiplay@lemmy.ml · 4 hours ago

      I’m on the low end with 8 GB VRAM, so models run partly on the GPU and partly in system RAM, which makes it halfway usable. I’m not an AI guy at all and use it mostly to play around. Occasionally it comes in handy for simple stuff like the brainstorming you mention, extracting text from images, or translating them. I also used it to help with programming here and there: asking questions while I was offline for a month, and refactoring code and functions just to see what can be done.

      Anyone wanting to use it as a main tool and a replacement for ChatGPT and the like clearly needs stronger hardware. I wish I had 16 GB; 8 GB is extremely limiting. But token speed is often at least 17 tokens per second and sometimes over 50. That’s about what I can do.
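
      For context on those numbers, a rough sketch of how tokens per second can be measured with the same llama-cpp-python bindings as above; the low n_gpu_layers value is what “run partly on the GPU and partly in system RAM” amounts to, and the path and settings are again placeholders:

        # Stream a completion and time it; each streamed chunk is ~one token.
        import time
        from llama_cpp import Llama

        llm = Llama(model_path="models/model-q4_k_m.gguf", n_gpu_layers=20)

        start, n_tokens = time.time(), 0
        for chunk in llm("Explain what VRAM offloading does.", max_tokens=200, stream=True):
            n_tokens += 1
        print(f"{n_tokens / (time.time() - start):.1f} tokens/s")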