BREAKING NEWS: Google just re-entered the game 🔥🔥 They want to take the crown 👑 back from Chinese open source AI. And... Gemma 4 is FINALLY Apache 2.0, i.e. a real open-source license. From what I've seen it's going to be a pretty significant model. But give it a try yourself today:

brew upgrade llama.cpp

# you might need to install from source until build 8637 is in your package manager later today:
brew install llama.cpp --HEAD

🔴 My personal recommendation:

if you have at least 24GB of RAM or VRAM, run the (very good) 26B MoE:
llama-server -hf ggml-org/gemma-4-26B-A4B-it-GGUF:Q4_K_M

if you have 16GB of RAM or VRAM, run the dense E4B:
llama-server -hf ggml-org/gemma-4-E4B-it-GGUF:Q8_0
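Once llama-server is up, it exposes an OpenAI-compatible HTTP API on port 8080 by default. A minimal sketch of talking to it from Python using only the standard library — the model name, prompt, and temperature here are illustrative assumptions, and the server must already be running locally:

```python
import json
import urllib.request

# llama-server's default listen address; change if you passed --port
LLAMA_SERVER_URL = "http://localhost:8080/v1/chat/completions"


def build_chat_request(prompt, temperature=0.7):
    """Build the JSON payload for the OpenAI-compatible chat endpoint."""
    return {
        # when serving a single model, llama-server accepts any model name
        "model": "gemma-4",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }


def ask(prompt):
    """POST a prompt to a locally running llama-server and return the reply text."""
    payload = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        LLAMA_SERVER_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # standard OpenAI-style response shape
    return body["choices"][0]["message"]["content"]
```

With the server running, `ask("Write a haiku about open-source AI.")` returns the model's reply as a string; the same request works against either the 26B MoE or the E4B, since the endpoint shape doesn't change with the model.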
M3 Max with 128GB of RAM to try this out. Let’s see how hot my Mac gets.
Sees RTX 3090… looks at his RTX 3080… does the Vader “Noooooooo!” scream in his office. So close. I’m waiting for the quants.
Will try Gemma 4 E4B with Turbo Quant
26B MOE SUATMM 😍
I'll catch up
Will definitely give it a try.
brew install llama.cpp --HEAD if you want to build from source locally!