Key Highlights –
- Google has released Gemma 4, a family of four open AI models under the Apache 2.0 licence. The family covers E2B and E4B variants for phones and edge devices, and 26B and 31B models for PCs and workstations. All of these are available immediately for commercial and research use.
- The smallest models run offline on hardware as constrained as a Raspberry Pi or mobile phone, supporting multimodal tasks including speech, vision, and code generation with low latency and no cloud dependency.
- All four models support up to 256K-token context windows, multistep reasoning, and agentic workflows. They cover more than 140 languages, and the largest models are offered in both speed-focused and quality-focused configurations.
Google released Gemma 4 today, and the headline feature is not the largest model in the family but the smallest. The E2B and E4B variants are built specifically for edge devices: phones, Raspberry Pi boards, and other low-power hardware. They run offline, handle speech, vision, and code generation without a cloud connection, and are designed to operate with low latency on hardware on which most AI models cannot run at all.
This is a meaningful shift in what “open AI model” means in practice. Most open model releases target developers with access to capable GPUs. Gemma 4’s E2B and E4B push the floor down to consumer hardware and embedded devices. That matters greatly for privacy-sensitive applications, offline environments, and markets where cloud connectivity is unreliable or expensive.
The Full Model Range
The four Gemma 4 sizes cover meaningfully different use cases rather than simply offering more of the same at different parameter counts.
The E2B and E4B are edge-first, multimodal, and offline-capable. The 26B targets personal computers and workstations in a speed-optimised configuration. The 31B is the quality-focused variant for the same hardware range, designed for tasks where accuracy matters more than response time, such as complex reasoning, research synthesis, and extended document analysis.
All four share the same core capabilities: 256K-token context windows, multistep reasoning, agentic workflow support, and 140-plus language coverage. The 256K context window stands out in particular for the larger models: it allows analysis of full codebases or lengthy documents within a single session, without the truncation that limits many competing open models.
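As a rough illustration of what a 256K-token window permits, the sketch below feeds an entire document to a locally hosted model in one pass. It assumes the Hugging Face transformers library; the repository ID and the file name are placeholders, not names confirmed by this release.

```python
# Hypothetical sketch: single-pass analysis of a long document with a local model.
# The repository ID below is an assumption; check Hugging Face for the real names.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "google/gemma-4-26b"  # hypothetical ID, not confirmed by the release

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
)

with open("quarterly_report.txt") as f:  # any long document
    document = f.read()

prompt = f"Summarise the key findings of the following report:\n\n{document}"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
print(f"Prompt length: {inputs.input_ids.shape[-1]} tokens (window holds up to 256K)")

# Generate a summary without chunking or truncating the source text.
output = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(output[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```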
Apache 2.0 and What It Means
For open model releases, the licensing choice matters as much as the capability. Apache 2.0 permits commercial use, modification, and distribution without the restrictions that come with more limiting open licences. Developers can build products on top of Gemma 4, fine-tune it for specific domains, and deploy it in commercial applications without licensing fees or usage constraints from Google.
This positions Gemma 4 directly against Meta’s Llama series, which operates under a separate licence with some commercial restrictions, and against Mistral’s open releases. Google is making a deliberate play for the developer community that builds local-first, privacy-first, or cost-sensitive AI applications.
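Because Apache 2.0 allows modification and redistribution, domain fine-tuning is the obvious next step for many teams. The sketch below shows one common approach, a LoRA adapter built with Hugging Face’s peft library; the model ID is a placeholder and the hyperparameters are illustrative, not recommendations from Google.

```python
# Illustrative LoRA fine-tuning setup; model ID and hyperparameters are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

MODEL_ID = "google/gemma-4-e4b"  # hypothetical repository ID

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
base_model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
)

# Low-rank adapters keep the trainable parameter count small enough for a single GPU.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # typical attention projections; verify per model
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()

# From here, train with transformers.Trainer or trl.SFTTrainer on a domain dataset,
# then merge or publish the adapter; Apache 2.0 imposes no extra licensing step.
```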
What It Means for Users and the Industry
For individual developers and researchers, Gemma 4 removes the cost barrier to running capable AI models locally. A 26B or 31B parameter model with a 256K context window on a personal workstation was not practically achievable at this quality level twelve months ago. The E2B running offline on a phone opens an entirely different set of use cases: real-time translation, on-device document processing, and voice assistants that function without any internet access.
For the industry, the question is whether Google’s open model strategy meaningfully pressures OpenAI, Anthropic, and others to accelerate their own open releases. Whether the gap between open and closed frontier models is wide enough that Gemma 4 serves a distinct market, rather than competing directly with GPT or Claude, remains to be seen. The 256K context and agentic workflow support suggest Google is not conceding the high-end use cases entirely.
Wrapping Up
Gemma 4 is available now on Hugging Face and Google AI Studio. The Apache 2.0 licence means the barrier to trying it is essentially zero. Whether the E2B’s on-device performance actually holds up in real-world conditions, rather than just in controlled benchmarks, is the first thing developers will test. Google’s track record with Gemma has been stronger than its critics expected. Gemma 4 will be judged the same way.
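For anyone who wants to run that test, a minimal local smoke test looks something like the sketch below, assuming the transformers library and a hypothetical E2B repository ID; the real model names on Hugging Face may differ.

```python
# Minimal local smoke test; the model ID is an assumption, not a confirmed name.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="google/gemma-4-e2b",  # hypothetical ID; check the Gemma page on Hugging Face
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

result = generator(
    "Explain, in two sentences, why on-device inference helps privacy.",
    max_new_tokens=80,
)
print(result[0]["generated_text"])
```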









