How to Fine-tune, Quantize, and Run Microsoft phi-1.5
Microsoft released phi-1.5, a new large language model (LLM) with 1.3 billion parameters.
It’s 5.4 times smaller than the smallest Llama 2 model (Llama 2 7B). Yet, according to the evaluation conducted by Microsoft and published on arXiv, phi-1.5 significantly outperforms Llama 2 on several tasks.
Given its relatively small size and the claimed performance, phi-1.5 is a good candidate LLM for affordable AI.
In this article, we will see what could explain this performance: how the model was trained and what training data were used. I also show how to fine-tune, quantize, and run the model, and I benchmark its memory consumption and inference speed.
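As a preview of how the model is run later in the article, here is a minimal sketch that loads phi-1.5 with Hugging Face Transformers and generates a completion. The model ID microsoft/phi-1_5, the half-precision dtype, and the prompt are assumptions; adjust them to your setup.

```python
# Minimal sketch: load phi-1.5 from the Hugging Face Hub and generate a completion.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-1_5"  # assumed Hugging Face model ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # half precision to reduce memory use
    device_map="auto",           # place the model on the available GPU(s)
    trust_remote_code=True,      # the release version ships custom modeling code
)

prompt = "Write a short function that reverses a string in Python."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```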
This article was originally published in The Kaitchup, my newsletter.
For more articles like this and to support my work, consider subscribing to The Kaitchup.

phi-1.5: The Power of Distillation
In the paper describing phi-1.5, Microsoft presents 3 models trained on different datasets:
phi-1.5: They have only released this model (under a permissive license allowing commercial use).
phi-1.5-web: Trained on the same data as phi-1.5, augmented with a heavily curated dataset crawled from the web.
phi-1.5-web-only: Trained only on the heavily curated dataset crawled from the web.
Microsoft didn’t release phi-1.5-web and phi-1.5-web-only. I think they only trained and evaluated them for contrastive experiments showing that they don’t need to augment the training data used by phi-1.5.
First, let’s have a closer look at the claimed performance of the models: