Version 0.7 of 'llamafile', which distributes and runs large language models as a single file, speeds up prompt processing by up to 10 times
![](https://i.gzn.jp/img/2024/04/02/llamafile-10x-faster-prompt-evaluation-times/00_m.png)
'llamafile v0.7', a package that lets you easily distribute and run a large language model (LLM) as a single executable file of only about 4GB, has been released. This version improves the performance and accuracy of computation on both the CPU and the GPU.
Release llamafile v0.7 · Mozilla-Ocho/llamafile · GitHub
https://github.com/Mozilla-Ocho/llamafile/releases/tag/0.7
![](https://i.gzn.jp/img/2024/04/02/llamafile-10x-faster-prompt-evaluation-times/github.png)
Llamafile 0.7 Brings AVX-512 Support: 10x Faster Prompt Eval Times For AMD Zen 4 - Phoronix
![](https://i.gzn.jp/img/2024/04/02/llamafile-10x-faster-prompt-evaluation-times/phoronix.png)
LLaMA Now Goes Faster on CPUs
llamafile is a mechanism that lets developers and end users easily distribute and use LLMs by packaging a model as a single file that can be executed on most systems.
How to easily run ``llamafile'', a mechanism that allows you to easily distribute and execute AI using large-scale language models with just one executable file of only 4 GB, on Windows and Linux - GIGAZINE
![](https://i.gzn.jp/img/2023/12/10/llamafile/00_m.png)
'llamafile v0.7', released on March 31, 2024 (local time), is reported to significantly improve prompt processing speed on CPUs.
Engineer Justine Tunney ran 'llamafile v0.7', 'llamafile v0.6.2', and 'llama.cpp 2024-03-26', the inference engine that is also bundled in llamafile, and published the differences in processing speed.
Below are the results for a 2020 HP machine (equipped with an Intel Core i9-9900) that Tunney had on hand. Across different models and parameters, llamafile v0.7 showed the best results.
![](https://i.gzn.jp/img/2024/04/02/llamafile-10x-faster-prompt-evaluation-times/fig1_m.png)
Tunney also shows the results of running on a Raspberry Pi v5 (ARMv8.2) and a Raspberry Pi v4 (ARMv8.0). On the Raspberry Pi v5, llamafile v0.7 is up to 8 times faster than the previous version.
According to the release, llamafile v0.7 supports 'AVX-512', an instruction set extension introduced by Intel, and prompt evaluation is reported to be up to 10 times faster on processors that implement it, such as those based on AMD's Zen 4 architecture.
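Whether a given Linux machine reports AVX-512 at all can be checked from the CPU flags that the kernel exposes. The following is a minimal sketch, assuming an x86 Linux system where `/proc/cpuinfo` contains a `flags` line. The function names `avx512_flags` and `read_cpuinfo` are hypothetical helpers written for this illustration, and the sketch does not reflect how llamafile itself detects CPU features.

```python
def avx512_flags(cpuinfo_text: str) -> set:
    """Return the AVX-512 feature flags found in /proc/cpuinfo-style text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        # Each entry looks like "key<tabs>: value"; only "flags" lines matter here.
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            found.update(f for f in value.split() if f.startswith("avx512"))
    return found


def read_cpuinfo(path: str = "/proc/cpuinfo") -> str:
    """Read the kernel's CPU description; return "" if the file is unavailable."""
    try:
        with open(path, encoding="utf-8") as fh:
            return fh.read()
    except OSError:
        return ""


if __name__ == "__main__":
    flags = avx512_flags(read_cpuinfo())
    if flags:
        print("AVX-512 flags reported:", ", ".join(sorted(flags)))
    else:
        print("No AVX-512 flags reported (or not an x86 Linux system)")
```

An empty result only means that no flag beginning with `avx512` was listed. It says nothing about how much faster or slower llamafile will run on that machine.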
in Software, Posted by logc_nt