An official Docker image is now available for 'Ollama', an application that makes it easy to run various chat AIs locally

The official Docker image of 'Ollama', an application that lets you easily run open source large-scale language models such as 'Mistral', 'Llama 2', and 'Vicuna' locally, has been released, so I immediately tried it out.

Ollama is now available as an official Docker image · Ollama Blog

Representative examples of large-scale language models that can be run on Ollama are shown below. The full list can be checked on Ollama's official website.
Model | Parameters | Size | Download command
Mistral | 7B | 4.1GB | ollama run mistral
Llama 2 | 7B | 3.8GB | ollama run llama2
Code Llama | 7B | 3.8GB | ollama run codellama
Llama 2 Uncensored | 7B | 3.8GB | ollama run llama2-uncensored
Llama 2 13B | 13B | 7.3GB | ollama run llama2:13b
Llama 2 70B | 70B | 39GB | ollama run llama2:70b
Orca Mini | 3B | 1.9GB | ollama run orca-mini
Vicuna | 7B | 3.8GB | ollama run vicuna

A model with 3B (3 billion) parameters requires at least 8GB of memory, a 7B (7 billion) model requires 16GB, and a 13B (13 billion) model requires 32GB of memory.

So, let's start setting up the environment to run Ollama. This time we will be using Docker, so install Docker using the method that suits your environment from the link below.

Install Docker Engine | Docker Documentation

Since I am using Debian this time, I entered the following commands.

[code]sudo apt-get update
sudo apt-get install ca-certificates curl gnupg

sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/debian/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg

echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] \
  https://download.docker.com/linux/debian \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin[/code]
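
Although the article does not include this step, Docker's documentation suggests verifying that the installation succeeded by running the hello-world test image:

[code]sudo docker run hello-world[/code]

If a 'Hello from Docker!' message is printed, the Docker Engine is working.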

Once Docker is installed, start Ollama. There is a mode that runs on the CPU only and a mode that uses an NVIDIA GPU for faster operation, but this time we will start in CPU-only mode, so execute the command below.
[code]docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama[/code]
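
For reference, according to the Ollama Docker image documentation, the GPU mode mentioned above is started with the same command plus the --gpus=all flag, and it assumes the NVIDIA Container Toolkit is already installed on the host:

[code]docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama[/code]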

After starting the Ollama container, you can operate Ollama using the command below.
[code]docker exec -it ollama [command you want to execute][/code]
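
For example, the following command should show the models downloaded so far, since 'list' is one of the subcommands of the Ollama CLI inside the container:

[code]docker exec -it ollama ollama list[/code]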

This time we will be using the 7B model of Llama 2, so referring to the 'Download command' column in the table above, I ran 'docker exec -it ollama ollama run llama2', and the model was automatically downloaded and launched. When it is ready for input, a '>>>' prompt is displayed, so you simply type your chat message there. First of all, I greeted it with 'Hello!'.
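
Incidentally, since the container publishes port 11434, Ollama's REST API should also be reachable from the host. A minimal sketch using the documented /api/generate endpoint:

[code]curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Hello!"
}'[/code]

The response comes back as a stream of JSON objects.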

When I tried Japanese, it replied, 'I can communicate in multiple languages!' Setting aside the performance of Llama 2 itself, I found that Japanese could be used without any problems when interacting with models through Ollama.
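
When you are done experimenting, the container can be stopped and removed with standard Docker commands. Note that downloaded models are stored in the 'ollama' volume created by the run command above, so they survive container removal unless the volume itself is also deleted:

[code]docker stop ollama        # stop the running container
docker rm ollama          # remove the container itself
docker volume rm ollama   # optionally delete downloaded models as well[/code]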

In addition, GIGAZINE has published articles about the models that can be used with Ollama, listed below. Please check them out if you want to know more about the models.

◆Mistral 7B
The large-scale language model 'Mistral 7B', which can be used and verified under a truly open source license, has appeared, making it possible to develop AI with performance exceeding 'Llama 2 13B' and 'Llama 1 34B' - GIGAZINE

◆Llama 2
Meta releases the commercially usable large-scale language model 'Llama 2' for free, and collaborates with Microsoft and Qualcomm to optimize it for smartphones and PCs - GIGAZINE

◆Llama 2 Uncensored
What are the benefits of creating an uncensored model that removes the 'censorship' of large-scale language models? - GIGAZINE

◆Vicuna
Data for the Japanese-capable chat AI 'Vicuna-13B', with performance comparable to ChatGPT, has been released and can be run on an ordinary home PC - GIGAZINE

in Review, Software, Posted by log1d_ts