diffusiongemma-26B-A4B-it

diffusiongemma‑26B‑A4B‑it is a Gemma 4 mixture‑of‑experts (MoE) block‑diffusion model with 26B total parameters (4B active). It generates text in 256‑token chunks via iterative denoising, delivering up to ~6× higher per‑request throughput than the autoregressive Gemma 4 26B‑A4B. It offers a 256K multimodal context window and is released under Apache 2.0.
Core Model
Chat
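
To make the block‑wise decoding described above concrete, here is a minimal back‑of‑the‑envelope sketch in Python. It only counts how many 256‑token blocks a response of a given length spans and how many denoising passes that implies. The function names and the `steps_per_block` value are hypothetical and chosen for illustration; the number of denoising steps the model actually uses per block is not stated here, and a denoising pass does not necessarily cost the same as an autoregressive decoding step.

```python
BLOCK_SIZE = 256  # tokens per generated chunk, as stated in the model description


def blocks_needed(output_tokens: int, block_size: int = BLOCK_SIZE) -> int:
    """Return how many blocks are needed to cover `output_tokens`.

    The last block may be only partially filled, so we round up.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    # Integer ceiling division, avoids floating-point rounding.
    return (output_tokens + block_size - 1) // block_size


def denoising_passes(output_tokens: int, steps_per_block: int,
                     block_size: int = BLOCK_SIZE) -> int:
    """Rough count of model passes for block-wise decoding.

    `steps_per_block` is an ASSUMED, illustrative parameter: it is not a
    documented property of the model.
    """
    if steps_per_block < 1:
        raise ValueError("steps_per_block must be at least 1")
    return blocks_needed(output_tokens, block_size) * steps_per_block


if __name__ == "__main__":
    tokens = 1000
    steps = 32  # hypothetical value, for illustration only
    print("blocks:", blocks_needed(tokens))
    print("denoising passes:", denoising_passes(tokens, steps))
    print("autoregressive steps (one per token):", tokens)
```

Under these assumptions, a 1000‑token response spans 4 blocks. Because each pass refines a whole block at once instead of emitting a single token, total completion time can drop even though the first token is not streamed immediately.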

Getting Started

Step 1

Sign up, get your API key, and use the model with unlimited tokens for 30 days.

Step 2

Paste the URL of the Hugging Face repository: https://huggingface.co/google/diffusiongemma-26B-A4B-it

Step 3

Choose the GPU machine to deploy on.

That’s all! You’re ready to use the model in a few minutes, without infrastructure complexity.


Additional Information

Credits to Google


Applications & Use Cases

  • High‑throughput content generation services (summaries, drafts, marketing copy, documentation) where batched requests and total completion time matter more than streaming the first token as quickly as possible.
  • Long‑context multimodal analysis over documents plus images (and short videos via frames) using the 256K context window and thinking mode for large‑batch offline processing.
  • Function‑calling and agent backends that execute multi‑step reasoning in a single large response, benefiting from the MoE efficiency (4B active) and block‑wise decoding.
  • Cost‑sensitive deployments that want near‑Gemma‑4‑31B quality with better GPU utilization and higher tokens‑per‑second per request on a single H100 or similar GPU.