I want to host some LLMs locally and use more advanced models. Since new hardware is out of the question, I think I should be able to pull something off by buying some yesteryear equipment on eBay etc. Has anybody attempted such a project? Does it scale horizontally? (I.e., can I connect two boxes to overcome single-box slowness?)

  • Barbecue Cowboy@lemmy.dbzer0.com
    4 hours ago

    With older hardware, once you accumulate enough VRAM to run a model, your bottleneck is going to shift to memory bandwidth, and your question is going to shift from ‘Can I run this model?’ to ‘Can I run this model at an acceptable speed?’
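
To put rough numbers on the bandwidth point, here is a back-of-the-envelope sketch. It assumes a dense model where generating each token reads every weight from memory once, so bandwidth divided by model size gives a ceiling on tokens per second. The function names and the example figures (a 70B model at 4-bit, a card with 300 GB/s) are illustrative assumptions, not measurements, and real throughput will be lower once KV cache reads, compute, and any multi-GPU overhead are counted.

```python
def model_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB: parameters times bytes per weight."""
    if params_billion <= 0 or bits_per_weight <= 0:
        raise ValueError("params_billion and bits_per_weight must be positive")
    return params_billion * bits_per_weight / 8


def estimate_tokens_per_second(size_gb: float, bandwidth_gb_s: float) -> float:
    """Rough upper bound on decode speed.

    Assumption: every generated token streams all weights from memory once,
    so the ceiling is memory bandwidth divided by model size.
    """
    if size_gb <= 0 or bandwidth_gb_s <= 0:
        raise ValueError("size_gb and bandwidth_gb_s must be positive")
    return bandwidth_gb_s / size_gb


if __name__ == "__main__":
    # Hypothetical example: 70B parameters quantized to 4 bits per weight.
    size = model_size_gb(params_billion=70, bits_per_weight=4)
    # Hypothetical older card with 300 GB/s of memory bandwidth.
    ceiling = estimate_tokens_per_second(size, bandwidth_gb_s=300)
    print(f"weights: {size:.1f} GB, ceiling: {ceiling:.1f} tokens/s")
```

Plugging in the published bandwidth of whatever used card you are considering gives a quick sanity check on whether it is worth buying for the model size you have in mind.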