Microsoft just casually shared their new Phi-3 LLMs less than a week after the Llama 3 release. Based on the benchmarks in the technical report (https://arxiv.org/abs/2404.14219), even the smallest Phi-3 model beats Llama 3 8B despite being less than half the size.

Phi-3 was trained on "only" 3.3 trillion tokens, roughly 5x fewer than Llama 3's 15 trillion.

Phi-3-mini has "only" 3.8 billion parameters, less than half the size of Llama 3 8B.

Despite being small enough to be deployed on a phone (according to the report), it matches the performance of the much larger Mixtral 8x7B and GPT-3.5. (Phi-3-mini can be quantized to 4 bits, so it requires only ≈1.8 GB of memory.)
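The ≈1.8 GB figure is consistent with simple weight-only arithmetic. A quick sketch (ignores activations, KV cache, and quantization overhead such as per-block scales):

```python
# Weight-only memory estimate for a 4-bit quantized 3.8B-parameter model.
# Real usage is slightly higher: activations, KV cache, and per-block
# quantization scales are not counted here.
params = 3.8e9
bits_per_param = 4
memory_gb = params * bits_per_param / 8 / 1e9
print(f"~{memory_gb:.1f} GB")  # -> ~1.9 GB, in line with the reported ≈1.8 GB
```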

What is the secret sauce? According to the technical report, it's dataset quality over quantity: "heavily filtered web data and synthetic data".

Alongside the 4K context-window version, there's also a Phi-3-mini-128K model that supports up to 128K tokens.
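For reference, a minimal loading sketch, not an official snippet: it assumes the checkpoints live on the Hugging Face Hub under ids like microsoft/Phi-3-mini-128k-instruct, and that transformers, accelerate, and bitsandbytes are installed.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Assumed Hub id for the long-context variant; a freshly released
# architecture may also require trust_remote_code=True.
model_id = "microsoft/Phi-3-mini-128k-instruct"

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,                      # 4-bit weights, per the memory note above
        bnb_4bit_compute_dtype=torch.bfloat16,  # do the matmuls in bf16
    ),
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

prompt = "Summarize the Phi-3 technical report in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```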

Fun fact: Phi-3 uses the same tokenizer as Llama 2, with a vocabulary size of 32,064.
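You can sanity-check that number from the model config (same assumed Hub id as above; the vocab_size attribute is the standard transformers convention):

```python
from transformers import AutoConfig

# Assumed Hub id; the padded vocabulary size lives in the model config.
cfg = AutoConfig.from_pretrained(
    "microsoft/Phi-3-mini-128k-instruct", trust_remote_code=True
)
print(cfg.vocab_size)  # expected: 32064, per the report
```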