'Ollama,' a tool that makes it easy to run AI models on your own PC, has numerous problems, and some argue that using llama.cpp directly is the better choice.



Software engineer Zetaphor has published a blog post criticizing the development approach and operational policies of 'Ollama,' a tool that allows users to run AI locally.

Friends Don't Let Friends Use Ollama | Sleeping Robots

https://sleepingrobots.com/dreams/stop-using-ollama/

Ollama is a tool that lets you run a variety of AI models with a single command. You can see what it's like to actually use it in the article below.

OpenAI's open weight model 'gpt-oss' can be easily used on personal PCs using Ollama - GIGAZINE
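As a rough sketch, the "single command" workflow looks like this (assuming Ollama is installed; the model name is an example):

```shell
# Download the model on first use and open an interactive chat
ollama run llama3

# Or send a single prompt non-interactively
ollama run llama3 "Explain GGUF in one sentence."
```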



Before Ollama emerged, 'llama.cpp' appeared in 2023 as a way to run 'LLaMA,' a large language model with top-class performance at the time, on relatively low-spec devices.

Meta's 'LLaMA,' a rival to GPT-3, can now run on Macs with the M1 chip, demonstrating that large-scale language models can be run on ordinary consumer hardware - GIGAZINE



Ollama was implemented as a wrapper that makes llama.cpp easier to use, but for more than a year after its release in mid-2023, it did not disclose that it used llama.cpp internally.

llama.cpp is distributed under the MIT license, whose main condition is that credit be given. As long as credit is given, anyone may use it freely, including commercially, but Ollama failed to meet even this single condition.

The community noticed that llama.cpp was being used internally, and on March 17, 2024, a request for correction was submitted. However, the Ollama maintainers did not respond. In April, after more specific feedback was received, they added a 'Supported Backends' section listing 'llama.cpp.' At the same time, one of the maintainers stated that they would 'migrate to a different engine in the future.'

Zetaphor expressed his dissatisfaction, saying, 'It seems they don't intend to give llama.cpp any notable credit.' He also quoted a comment posted on the engineer-focused news site Hacker News saying 'the llama team should be properly recognized,' showing that others share this view.



According to Zetaphor, Ollama's problems are not limited to the credit issue, but include numerous other issues, as listed below.

• Low-quality proprietary backend
Ollama developed its own backend and switched away from llama.cpp, but the quality was reportedly poor, with performance roughly half that of llama.cpp.

• Misleading model names
Ollama shipped a lower-performing distilled model under the name 'DeepSeek-R1,' misleading users and damaging the original model's reputation.

• Mixing in closed-source apps
Despite presenting itself as open-source software, Ollama developed its desktop version in a private repository and distributed it without a license. Because the download button for the desktop version sat right next to the GitHub link on Ollama's website, users were easily led to believe it was open source.

• Inconvenient proprietary specifications
Although Ollama uses the GGUF file format, which was designed to be 'complete in a single file,' it layers a custom Docker-like configuration file on top. Worse, simply changing a parameter copies the entire model, which can be several gigabytes in size.
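For illustration, the Docker-like configuration file in question looks like the following (a minimal sketch in Ollama's Modelfile syntax; the base model name and parameter values are examples):

```
FROM llama3
PARAMETER temperature 0.7
PARAMETER num_ctx 4096
SYSTEM "You are a concise assistant."
```

Registering this with `ollama create my-model -f Modelfile` produces a new model entry, and according to the criticism above, even a one-parameter change like this duplicates the multi-gigabyte weights rather than referencing them.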

• New models are slow to become available
When a new model is released, Ollama staff must first do preparatory work before it can run on Ollama, whereas llama.cpp can run a model the moment it is released.
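This is because llama.cpp consumes GGUF files directly, so a newly published model can be run immediately (a sketch assuming a llama.cpp build is on the PATH; the file name is an example):

```shell
# Point llama-cli at any GGUF file -- no repacking or registry step required
llama-cli -m ./newly-released-model.gguf -p "Hello, introduce yourself."
```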

• Shift to the cloud
After gaining popularity by touting the ability to 'run AI locally and keep data private,' Ollama introduced cloud models that send data externally. It has also been criticized for low security awareness, including leaving a vulnerability that leaked authentication tokens unfixed for several months.

Based on these points, Zetaphor argued that 'what you really need when you want to run a large language model locally is llama.cpp, and the Ollama package is unnecessary.'

in AI, Software, Posted by log1d_ts