I want to host some LLMs locally and use more advanced models. Since new hardware is out of the question, I think I should be able to pull something off by buying some yesteryear equipment on eBay etc. Has anybody attempted such a project? Does it scale horizontally? (I.e., can I connect two boxes to overcome single-box slowness?)

  • xylogx@lemmy.world
    10 hours ago

    What size model? I can run 8-billion-parameter models on my GeForce RTX 3070 with 8 GB of VRAM. Bigger models need more memory. For $1-2k you can upgrade to a 16 or 32 GB video card. For $3k you can get a Framework Desktop with 128 GB of unified memory. For $6k you can get a DGX Spark with a Blackwell chip and 128 GB of unified memory. A Mac mini or Mac Studio is also a good choice in this price range.
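To put rough numbers on "bigger models need more memory", here is a back-of-the-envelope sketch. It assumes memory is dominated by the weights (parameter count times bits per weight) plus a flat 20% allowance for KV cache and runtime buffers. That 20% figure and the function names are my own illustrative assumptions, not measurements; real usage varies with context length and the inference runtime.

```python
# Rough memory estimate for running an LLM locally.
# Assumption: weights dominate, plus a flat overhead fraction (hypothetical 20%)
# standing in for KV cache, activations, and runtime buffers.

def estimate_model_memory_gb(params_billion, bits_per_weight, overhead_fraction=0.2):
    """Approximate memory needed, in GB (1 GB = 1e9 bytes)."""
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes * (1 + overhead_fraction) / 1e9


def fits(params_billion, bits_per_weight, memory_gb, overhead_fraction=0.2):
    """True if the estimate is within the given memory budget."""
    needed = estimate_model_memory_gb(params_billion, bits_per_weight, overhead_fraction)
    return needed <= memory_gb


if __name__ == "__main__":
    # Memory sizes mentioned in this thread: 8, 16, 32, 64 and 128 GB.
    budgets_gb = [8, 16, 32, 64, 128]
    for params in (8, 32, 70):
        for bits in (16, 8, 4):
            needed = estimate_model_memory_gb(params, bits)
            ok = [b for b in budgets_gb if needed <= b]
            smallest = f"{ok[0]} GB" if ok else "none listed"
            print(f"{params}B @ {bits}-bit: ~{needed:.1f} GB, smallest fit: {smallest}")
```

By this estimate an 8B model only fits in 8 GB once it is quantized to around 4 bits per weight, which matches running it on an 8 GB card. Treat the output as a sizing hint, not a guarantee.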

    • Barbecue Cowboy@lemmy.dbzer0.com
      7 hours ago

      The 64 GB Framework Desktop runs just barely over $2k configured minimally. I went that route because I thought it was a better option than a discrete 32 GB video card, but there are tradeoffs with compatibility. Something to think about if your budget is above $2k but not quite $3k.