In production on millions of boxes and the payoff is a 25% reduction in machines needed for some inference workloads ...
I ran Google's Gemma AI locally on a cheap mini PC, and it handled more of my everyday ChatGPT tasks than I expected.