What do you mostly use ollama for?

catty@lemmy.world · 2 days ago

What do you mostly use ollama for?

seathru@lemmy.sdf.org · 2 days ago

I currently don’t, but I am ollama-curious. I would like to feed it a bunch of technical manuals and then be able to ask it to recite specs or procedures (with optional links to its source info for sanity checking). Is this where I need to be looking/learning?

brendansimms@lemmy.world · 1 day ago

You might want to look into RAG (retrieval-augmented generation) and ‘long-term memory’ concepts. I’ve been playing around with creating a self-hosted LLM that has long-term memory (using pre-trained models), which is essentially the same thing you’re describing. Also - GPU matters. I’m using an RTX 4070 and it’s noticeably slower than something like in-browser ChatGPT, but I know the 4070 is kinda pricey, so many home users might have earlier/slower GPUs.
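To make the RAG idea a bit more concrete, here is a rough sketch of just the retrieval half, assuming the manuals have already been split into small chunks that each remember where they came from. Everything in it is made up for illustration: the `CHUNKS` data, the file names, and the helper names `retrieve` and `build_prompt` are hypothetical, and the word-overlap scoring is a toy stand-in for the embedding model and vector store a real setup would probably use. The resulting prompt would then be handed to whatever local model you run.

```python
import math
import re
from collections import Counter

# Hypothetical manual chunks; a real setup would extract these from the PDFs.
CHUNKS = [
    {"source": "pump_manual.pdf p.12",
     "text": "The maximum operating pressure of the pump is 150 psi."},
    {"source": "pump_manual.pdf p.30",
     "text": "Replace the oil filter every 500 hours of operation."},
    {"source": "valve_guide.pdf p.4",
     "text": "Torque the valve flange bolts to 45 Nm in a star pattern."},
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(question, chunks, k=2):
    """Return up to k chunks that share the most wording with the question."""
    q = Counter(tokenize(question))
    scored = [(cosine(q, Counter(tokenize(c["text"]))), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for score, c in scored[:k] if score > 0]


def build_prompt(question, hits):
    """Put the retrieved chunks, tagged with their sources, ahead of the question."""
    context = "\n".join(f"[{c['source']}] {c['text']}" for c in hits)
    return ("Answer using only the context below and cite the source in brackets.\n\n"
            f"{context}\n\nQuestion: {question}")


if __name__ == "__main__":
    question = "What is the maximum operating pressure of the pump?"
    print(build_prompt(question, retrieve(question, CHUNKS)))
```

Because each chunk carries its source along, the model can quote it back in the answer, which is what makes the sanity-checking links possible.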

Styxia@lemmy.world · 17 hours ago

How have you been making those models? I have a 4070 and doing it locally has been a dependency hellscape. I’ve been tempted to rent cloud GPU time just to save the hassle.

brendansimms@lemmy.world · 5 hours ago

I’m downloading pre-trained models. I had a bunch of dependency issues getting text-generation-webui to work, and honestly I probably installed some useless crap in the process, but I did get it to work. LM Studio is much simpler but offers less customization (or I just don’t know how to use it all in LM Studio). But yeah, I’m just downloading pre-trained models and running them in these UIs (right now I just loaded up ‘deepseek-r1-distill-qwen-7b’ in LM Studio). I also have the NVIDIA app installed and I make sure my GPU drivers are always up to date.
