Gemma 4 and Phi-4: Google and Microsoft in Open Source
Conversations about open source AI usually revolve around Meta, DeepSeek, and Alibaba. But Google and Microsoft have also placed serious bets in this field, and the results are impressive.
Gemma 4 and Phi-4 represent opposite philosophies: one bets on multiple sizes with a focus on efficiency; the other proves that small models can outperform giants in specific tasks.
Gemma 4: Google Gets in Shape
Launched on April 2, 2026, Gemma 4 was released by Google DeepMind as a family of four sizes. The highlight is the 31-billion-parameter model, which reached 3rd place globally on Chatbot Arena, a widely followed leaderboard based on human preference votes.
The numbers are impressive for a medium-sized model:
- AIME 2026 (competitive mathematics): 89.2% — frontier-level result
- GPQA Diamond (doctoral-level scientific reasoning): 84.3%
- LiveCodeBench v6 (real coding): 80.0%
The 26-billion-parameter model holds 6th place on the same leaderboard. Two models from the same family in the global top 10 is a rare achievement.
What Makes Gemma 4 Different?
Gemma 4 was trained with techniques derived from Gemini, Google's cutting-edge proprietary model, and that lineage shows in the benchmark results above.
The license is Apache 2.0, which permits commercial use, modification, and redistribution, subject to its attribution and notice requirements. The models are available via Hugging Face, Vertex AI, and Google AI Studio.
For teams that need a robust, auditable model running on their own infrastructure, Gemma 4 31B is today one of the strongest options available.
Phi-4: Small, Smart, Surprising
Microsoft went in the opposite direction. Phi-4 has only 14 billion parameters, a size that fits on a single consumer GPU once the weights are quantized.
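To put the 14 billion figure in perspective, here is a rough back-of-the-envelope sketch of how much memory the weights alone would need at different precisions. It assumes memory is simply parameter count times bytes per parameter; the KV cache, activations, and runtime overhead are not counted, so real usage will be higher. The function name and the precision labels are illustrative, not part of any library.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Assumption: only the weights are counted. KV cache, activations,
# and framework overhead come on top of these numbers.

BYTES_PER_PARAM = {
    "fp16": 2.0,  # 16-bit floating point
    "int8": 1.0,  # 8-bit quantization
    "int4": 0.5,  # 4-bit quantization
}


def weight_memory_gb(params_billions: float, precision: str) -> float:
    """Approximate weight memory in GB (1 GB = 1e9 bytes)."""
    # billions of parameters x bytes per parameter = gigabytes
    return params_billions * BYTES_PER_PARAM[precision]


# Phi-4 has 14 billion parameters (figure from the article).
for precision in BYTES_PER_PARAM:
    print(f"{precision}: ~{weight_memory_gb(14, precision):.1f} GB")
```

Under these assumptions the weights take about 28 GB at 16-bit precision and about 7 GB at 4-bit, which is why quantization is usually what makes a model of this size practical on consumer cards.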
The project's premise is clear: data quality matters more than parameter count.
Phi-4 was trained with high-quality synthetic data, filtered academic content, and curated datasets. The result is a model that performs above expectations for its size:
- Complex logical reasoning: competitive with 70B+ models on specific benchmarks
- Context window: 16K tokens (Phi-4-mini, a smaller 3.8B variant, offers 128K)
- Memory usage: efficient for its size, viable on consumer hardware
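The context window figures in the list above matter in practice because the prompt and the generated output share the same budget. The sketch below shows one way to check whether a request fits. It assumes "16K" and "128K" mean 16 × 1024 and 128 × 1024 tokens, and the dictionary keys and function name are hypothetical labels chosen for this example.

```python
# Context windows as stated in the article, in tokens.
# Assumption: "16K" = 16 * 1024 and "128K" = 128 * 1024.
CONTEXT_WINDOW = {
    "phi-4": 16 * 1024,        # 14B model
    "phi-4-mini": 128 * 1024,  # 3.8B variant
}


def fits_context(model: str, prompt_tokens: int, max_new_tokens: int) -> bool:
    """Return True if the prompt plus the requested output fits the window."""
    return prompt_tokens + max_new_tokens <= CONTEXT_WINDOW[model]


# A 16,000-token prompt with 1,000 tokens of output requested.
print(fits_context("phi-4", 16_000, 1_000))
print(fits_context("phi-4-mini", 16_000, 1_000))
```

A check like this is a quick way to decide when a long document needs chunking for Phi-4 or is better served by the smaller, longer-context variant.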
For those who need to run a model locally, on a high-end laptop or a mid-range server, Phi-4 is one of the few options that delivers quality reasoning without requiring heavy infrastructure.
Two Models, Two Use Cases
Gemma 4 is for those who need maximum performance in controlled corporate environments with medium-sized models. Phi-4 is for those who need solid reasoning on limited hardware — edge computing, local devices, embedded applications.
Together, they show that Google and Microsoft take open source seriously not as charity, but as strategy: developers who adopt these models tend to run workloads on the respective companies' clouds.
Conclusion
Gemma 4 and Phi-4 prove that size isn't everything. With the right training techniques and quality data, small and medium-sized models can compete with giants.
For solution architects who need to balance cost, privacy, and performance, these two families deliver concrete and viable options.
Sources:
- Google DeepMind — Gemma 4
- Google Blog — Gemma 4 Launch
- Microsoft — Introducing Phi-4
- Hugging Face — microsoft/phi-4
- BenchLM — Open Source LLM Rankings 2026
Published on Hive.blog | #ArtificialIntelligence #llm