[v0.16.0] Qwen3 30B A3B Q40 on 4 x Raspberry Pi 5 8GB · b4rtaz/distributed-llama · Discussion #255
Device: 4 x Raspberry Pi 5 8GB
Distributed Llama version: 0.16.0
Model: qwen3_30b_a3b_q40b4rtaz@raspberrypi2:~/distributed-llama $ ./dllama inference --prompt "<|im_start|>user
Please explain me where is Poland as I have 1 year<|im_end|>
<|im_start|>assistant
" --steps 128 --model models/qwen3_30b_a3b_q40/dllama_model_qwen3_30b_a3b_q40.m --tokenizer models/qwen3_30b_a3b_q40/dllama_tokenizer_qwen3_30b_a3b_q40.t --buffer-float-type q80 --nthreads 4 --max-seq-len 4096 --workers 10.0.0.1:9999 10.0.0...
Read more at github.com