Can I run this LLM? — fit by model × hardware

✅ Pick a combination

Every page is computed by the open FitLLM engine from official model configs, using per-layer KV-cache modeling rather than a naive estimate. Updated 2026-07-16.
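To make "per-layer KV-cache modeling" concrete, here is a minimal sketch of the idea. It is not the FitLLM engine's code. The function name `kv_cache_bytes` is hypothetical, the config keys follow common Hugging Face `config.json` conventions, and the sketch assumes that layers marked `sliding_attention` cache at most `sliding_window` tokens. It ignores weights, activations, and runtime overhead.

```python
def kv_cache_bytes(config, context_len, bytes_per_elem=2):
    """Estimate KV-cache size in bytes by summing over layers.

    Assumptions (illustrative, not the FitLLM engine's method):
    - keys follow common Hugging Face config.json naming
    - a "sliding_attention" layer caches at most `sliding_window` tokens
    - bytes_per_elem=2 corresponds to an fp16/bf16 cache
    """
    layers = config["num_hidden_layers"]
    heads = config["num_attention_heads"]
    # Grouped-query attention: fewer KV heads than query heads.
    kv_heads = config.get("num_key_value_heads", heads)
    head_dim = config.get("head_dim") or config["hidden_size"] // heads
    window = config.get("sliding_window")
    # Without per-layer types, treat every layer as full attention.
    layer_types = config.get("layer_types") or ["full_attention"] * layers

    total = 0
    for kind in layer_types:
        tokens = context_len
        if kind == "sliding_attention" and window:
            tokens = min(context_len, window)
        # Factor 2 covers the key tensor and the value tensor.
        total += 2 * kv_heads * head_dim * tokens * bytes_per_elem
    return total


# Illustrative config with grouped-query attention (8 KV heads, 32 query heads).
gqa_config = {
    "num_hidden_layers": 32,
    "num_attention_heads": 32,
    "num_key_value_heads": 8,
    "hidden_size": 4096,
}

# A naive calculator that assumes one KV head per query head overestimates 4x.
naive_config = dict(gqa_config, num_key_value_heads=32)

# Hypothetical hybrid model: two full-attention and two sliding-window layers.
hybrid_config = {
    "num_hidden_layers": 4,
    "num_attention_heads": 8,
    "num_key_value_heads": 2,
    "head_dim": 64,
    "sliding_window": 1024,
    "layer_types": ["full_attention", "sliding_attention"] * 2,
}

if __name__ == "__main__":
    gib = 1024 ** 3
    print(kv_cache_bytes(gqa_config, 8192) / gib)    # per-layer estimate
    print(kv_cache_bytes(naive_config, 8192) / gib)  # naive estimate
    print(kv_cache_bytes(hybrid_config, 4096))
```

The gap between the two first results shows why the attention layout in the config matters: the same parameter count can imply very different cache sizes at long context.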

Guides

Why naive VRAM calculators are wrong

Reference

LLM memory & architecture reference

What can I run on my hardware?

What can I run on RTX 3060 12GB
What can I run on RTX 4080 SUPER
What can I run on RTX 5080
What can I run on RTX 3090
What can I run on RTX 4090
What can I run on RTX 5090
What can I run on M5 Pro 48GB
What can I run on M5 Max 64GB
What can I run on M5 Max 128GB

Best GPU / Mac by model

Best GPU or Mac for Hy3
Best GPU or Mac for GLM-5.2
Best GPU or Mac for GLM-4.7-Flash
Best GPU or Mac for gpt-oss-20b
Best GPU or Mac for gpt-oss-120b
Best GPU or Mac for Qwen 3.6 35B-A3B
Best GPU or Mac for Qwen 3.6 27B
Best GPU or Mac for Qwen-AgentWorld-35B-A3B
Best GPU or Mac for Gemma 4 31b
Best GPU or Mac for Gemma 4 26b A4B
Best GPU or Mac for Gemma 4 12b
Best GPU or Mac for Llama-3.1-8B-Instruct
Best GPU or Mac for Llama-3.2-3B-Instruct
Best GPU or Mac for MiniCPM5-1B

GPU vs GPU

RTX 4090 vs RTX 3090
RTX 5090 vs RTX 4090
RTX 5080 vs RTX 4080 SUPER
RTX 5090 vs RTX 5080
RTX 4090 vs RTX 5080
RTX 3090 vs RTX 4080 SUPER

By model × hardware

Hy3 on RTX 3090
GLM-5.2 on RTX 3090
GLM-4.7-Flash on RTX 3090
gpt-oss-20b on RTX 3090
gpt-oss-120b on RTX 3090
Qwen 3.6 35B-A3B on RTX 3090
Qwen 3.6 27B on RTX 3090
Qwen-AgentWorld-35B-A3B on RTX 3090
Gemma 4 31b on RTX 3090
Gemma 4 26b A4B on RTX 3090
Gemma 4 12b on RTX 3090
Llama-3.1-8B-Instruct on RTX 3090
Llama-3.2-3B-Instruct on RTX 3090
MiniCPM5-1B on RTX 3090
Hy3 on RTX 4090
GLM-5.2 on RTX 4090
GLM-4.7-Flash on RTX 4090
gpt-oss-20b on RTX 4090
gpt-oss-120b on RTX 4090
Qwen 3.6 35B-A3B on RTX 4090
Qwen 3.6 27B on RTX 4090
Qwen-AgentWorld-35B-A3B on RTX 4090
Gemma 4 31b on RTX 4090
Gemma 4 26b A4B on RTX 4090
Gemma 4 12b on RTX 4090
Llama-3.1-8B-Instruct on RTX 4090
Llama-3.2-3B-Instruct on RTX 4090
MiniCPM5-1B on RTX 4090
Hy3 on RTX 4080 SUPER
GLM-5.2 on RTX 4080 SUPER
GLM-4.7-Flash on RTX 4080 SUPER
gpt-oss-20b on RTX 4080 SUPER
gpt-oss-120b on RTX 4080 SUPER
Qwen 3.6 35B-A3B on RTX 4080 SUPER
Qwen 3.6 27B on RTX 4080 SUPER
Qwen-AgentWorld-35B-A3B on RTX 4080 SUPER
Gemma 4 31b on RTX 4080 SUPER
Gemma 4 26b A4B on RTX 4080 SUPER
Gemma 4 12b on RTX 4080 SUPER
Llama-3.1-8B-Instruct on RTX 4080 SUPER
Llama-3.2-3B-Instruct on RTX 4080 SUPER
MiniCPM5-1B on RTX 4080 SUPER
Hy3 on RTX 5090
GLM-5.2 on RTX 5090
GLM-4.7-Flash on RTX 5090
gpt-oss-20b on RTX 5090
gpt-oss-120b on RTX 5090
Qwen 3.6 35B-A3B on RTX 5090
Qwen 3.6 27B on RTX 5090
Qwen-AgentWorld-35B-A3B on RTX 5090
Gemma 4 31b on RTX 5090
Gemma 4 26b A4B on RTX 5090
Gemma 4 12b on RTX 5090
Llama-3.1-8B-Instruct on RTX 5090
Llama-3.2-3B-Instruct on RTX 5090
MiniCPM5-1B on RTX 5090
Hy3 on RTX 5080
GLM-5.2 on RTX 5080
GLM-4.7-Flash on RTX 5080
gpt-oss-20b on RTX 5080
gpt-oss-120b on RTX 5080
Qwen 3.6 35B-A3B on RTX 5080
Qwen 3.6 27B on RTX 5080
Qwen-AgentWorld-35B-A3B on RTX 5080
Gemma 4 31b on RTX 5080
Gemma 4 26b A4B on RTX 5080
Gemma 4 12b on RTX 5080
Llama-3.1-8B-Instruct on RTX 5080
Llama-3.2-3B-Instruct on RTX 5080
MiniCPM5-1B on RTX 5080
Hy3 on RTX 3060 12GB
GLM-5.2 on RTX 3060 12GB
GLM-4.7-Flash on RTX 3060 12GB
gpt-oss-20b on RTX 3060 12GB
gpt-oss-120b on RTX 3060 12GB
Qwen 3.6 35B-A3B on RTX 3060 12GB
Qwen 3.6 27B on RTX 3060 12GB
Qwen-AgentWorld-35B-A3B on RTX 3060 12GB
Gemma 4 31b on RTX 3060 12GB
Gemma 4 26b A4B on RTX 3060 12GB
Gemma 4 12b on RTX 3060 12GB
Llama-3.1-8B-Instruct on RTX 3060 12GB
Llama-3.2-3B-Instruct on RTX 3060 12GB
MiniCPM5-1B on RTX 3060 12GB
Hy3 on M5 Max 128GB
GLM-5.2 on M5 Max 128GB
GLM-4.7-Flash on M5 Max 128GB
gpt-oss-20b on M5 Max 128GB
gpt-oss-120b on M5 Max 128GB
Qwen 3.6 35B-A3B on M5 Max 128GB
Qwen 3.6 27B on M5 Max 128GB
Qwen-AgentWorld-35B-A3B on M5 Max 128GB
Gemma 4 31b on M5 Max 128GB
Gemma 4 26b A4B on M5 Max 128GB
Gemma 4 12b on M5 Max 128GB
Llama-3.1-8B-Instruct on M5 Max 128GB
Llama-3.2-3B-Instruct on M5 Max 128GB
MiniCPM5-1B on M5 Max 128GB
Hy3 on M5 Max 64GB
GLM-5.2 on M5 Max 64GB
GLM-4.7-Flash on M5 Max 64GB
gpt-oss-20b on M5 Max 64GB
gpt-oss-120b on M5 Max 64GB
Qwen 3.6 35B-A3B on M5 Max 64GB
Qwen 3.6 27B on M5 Max 64GB
Qwen-AgentWorld-35B-A3B on M5 Max 64GB
Gemma 4 31b on M5 Max 64GB
Gemma 4 26b A4B on M5 Max 64GB
Gemma 4 12b on M5 Max 64GB
Llama-3.1-8B-Instruct on M5 Max 64GB
Llama-3.2-3B-Instruct on M5 Max 64GB
MiniCPM5-1B on M5 Max 64GB
Hy3 on M5 Pro 48GB
GLM-5.2 on M5 Pro 48GB
GLM-4.7-Flash on M5 Pro 48GB
gpt-oss-20b on M5 Pro 48GB
gpt-oss-120b on M5 Pro 48GB
Qwen 3.6 35B-A3B on M5 Pro 48GB
Qwen 3.6 27B on M5 Pro 48GB
Qwen-AgentWorld-35B-A3B on M5 Pro 48GB
Gemma 4 31b on M5 Pro 48GB
Gemma 4 26b A4B on M5 Pro 48GB
Gemma 4 12b on M5 Pro 48GB
Llama-3.1-8B-Instruct on M5 Pro 48GB
Llama-3.2-3B-Instruct on M5 Pro 48GB
MiniCPM5-1B on M5 Pro 48GB

All numbers are computed by the open-source fitllm-engine (MIT) from official model config.json values, so you can reproduce or audit them yourself. They are estimates: real usage varies with the runtime (llama.cpp, MLX, Ollama), the driver, and the display. Found a mismatch? Report it.