Ask HN: Build spec for home LLM box?
2 points by ActorNightly | 0 comments on Hacker News.
I've been out of the loop a bit on running models at home (with Ollama and such). I want to build an air-cooled home server to run the bigger models, like llama3 70b, which is about 40 GB when quantized to 4 bits. It seems like running two 3090s or 4090s is the way to go for this.

1) Does Ollama support loading a model across multiple GPUs?
2) Does anyone have a general parts list I can copy that works well? I'd prefer to go with three GPUs, but I worry that cooling may be an issue.
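For context, here's the rough VRAM math I'm working from. It's only a sketch: the ~4.5 bits/weight figure (typical of Q4-style quants) and the 10% overhead for KV cache and runtime buffers are my assumptions, not measured numbers.

    # Back-of-envelope VRAM estimate for a quantized model.
    # Assumptions (mine, not measured): ~4.5 effective bits/weight for a
    # Q4-style quant, plus ~10% overhead for KV cache and CUDA buffers
    # at modest context lengths.

    def vram_gb(params_billions: float, bits_per_weight: float,
                overhead: float = 1.1) -> float:
        """Rough VRAM needed in GB: weight storage times a fudge factor."""
        weights_gb = params_billions * bits_per_weight / 8
        return weights_gb * overhead

    if __name__ == "__main__":
        need = vram_gb(70, 4.5)  # llama3 70b at a 4-bit-class quant
        print(f"~{need:.0f} GB needed")
        for cards in (2, 3):
            total = cards * 24  # 3090/4090 both have 24 GB
            verdict = "fits" if total > need else "does not fit"
            print(f"{cards}x 24 GB cards -> {total} GB total: {verdict}")

By that estimate two 24 GB cards (48 GB total) clear the ~43 GB requirement with a little headroom for context; a third card mostly just adds margin, which is why I'm weighing it against the cooling hassle.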
