Microsoft has introduced Phi-3.5, the newest generation of its small language models. These models are designed for high performance and cost-effectiveness, matching or surpassing similarly sized and larger models on a range of benchmarks covering language, reasoning, coding, and mathematics.
According to Tom’s Guide, this release represents a significant improvement over its predecessor, outperforming comparable small models from leading companies such as Google, OpenAI, Mistral, and Meta on several key metrics.
Model Versions and Performance
Phi-3.5 is available in three versions:
- Phi-3.5-mini, with 3.8 billion parameters, for fast, everyday reasoning
- Phi-3.5-MoE, a mixture-of-experts model with 41.9 billion total parameters, for more powerful reasoning
- Phi-3.5-vision, with 4.15 billion parameters, for image and video analysis tasks
All three models are free to download and can be run locally with standard tooling, as sketched below.
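For readers who want to try the 3.8-billion-parameter model themselves, here is a minimal sketch of loading it with the Hugging Face transformers library. The model ID microsoft/Phi-3.5-mini-instruct is the published Hugging Face name; the prompt and generation settings are illustrative assumptions rather than details from the article.

```python
# Minimal sketch: run Phi-3.5-mini locally with Hugging Face transformers.
# The prompt and generation settings below are illustrative choices.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3.5-mini-instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # pick a dtype suited to the local hardware
    device_map="auto",    # spread weights across GPU/CPU automatically
)

# Phi-3.5 is an instruct model, so build the prompt with the chat template.
messages = [
    {"role": "user", "content": "Explain the Pythagorean theorem in two sentences."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output_ids = model.generate(input_ids, max_new_tokens=128, do_sample=False)

# Strip the prompt tokens and decode only the model's reply.
reply = tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(reply)
```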
The model demonstrated particularly strong performance in reasoning tasks, ranking second only to GPT-4o-mini among leading small models. It also excelled in math tests, significantly outperforming Llama and Gemini.
Real-World Testing and Limitations
The author installed and tested the smaller 3.8-billion-parameter version of Phi-3.5 on his laptop. While the model provided detailed answers, its wording often left room for improvement, and it struggled with some simple tasks. For instance, it failed a basic exercise: writing a short one-sentence story in which each word begins with the last letter of the previous one.
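To make that exercise concrete, the snippet below is a small illustrative checker (not something from the article or the author's test) that verifies whether a sentence satisfies the constraint: every word must begin with the last letter of the word before it.

```python
# Illustrative checker for the "letter chain" story exercise described above.
import string

def is_letter_chain(sentence: str) -> bool:
    """Return True if every word starts with the previous word's last letter."""
    words = [w.strip(string.punctuation).lower() for w in sentence.split()]
    words = [w for w in words if w]  # drop tokens that were pure punctuation
    return all(prev[-1] == curr[0] for prev, curr in zip(words, words[1:]))

print(is_letter_chain("Anna ate eagerly yet tired dogs slept."))  # True
print(is_letter_chain("The quick brown fox jumps."))              # False
```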
NIX Solutions notes that the author did not test the larger version of Phi-3.5, which is expected to address some of the smaller version’s shortcomings. Benchmark results suggest its performance may be comparable to OpenAI’s GPT-4o-mini, the model available in the free version of ChatGPT.
One of Phi-3.5’s strengths lies in its performance on complex tasks across multiple languages, particularly in STEM and social-science topics.
We’ll keep you updated on any further developments or improvements to the Phi-3.5 model as they become available.