Benchmarks & Cost Optimization
7 min read

MiniMax M2.7 vs Kimi K2.5: when to use which

Both MiniMax M2.7 and Kimi K2.5 are open-weight Mixture-of-Experts models released in early 2026 that punch well above their cost class. They are not…

Alex Genovese
Case Studies & Community Stories
5 min read

Using Petri to audit open-source LLMs with Regolo

A practical guide to choosing between Qwen, Mistral, and Gemma based on workload, cost, serving profile, and model family tradeoffs.

Alex Genovese
Self‑Hosting & DevOps
5 min read

Why TurboQuant matters for real-world LLM inference

TurboQuant is a KV-cache compression method from Google Research that was presented at ICLR 2026. In the reported results, it compresses KV cache values…

Alex Genovese
Benchmarks & Cost Optimization
9 min read

Sustainable inference is now an AI infrastructure decision

Artificial intelligence is simultaneously our most promising tool for fighting climate change and one of its fastest-growing contributors. As AI adoption accelerates globally, the…

Alex Genovese