The open source, commercially usable large language model 'Llama 2' has arrived on Replicate, so I tried using it via the API



The high-performance open source AI model “Llama 2”, released on July 18, 2023, has appeared on “Replicate”, a site where anyone can easily deploy AI models, so I tried it right away.

Accessing Llama 2 from the command-line with the llm-replicate plugin
https://simonwillison.net/2023/Jul/18/accessing-llama-2/


The variations of Llama 2 and performance comparisons with other models are covered in the article below.

Meta releases a commercially available large-scale language model 'Llama 2' for free, and cooperates with Microsoft and Qualcomm to optimize for smartphones and PCs - GIGAZINE



A tool for trying Llama 2 from the browser also appeared right away, as covered in the article below.

'LLaMA2 Chatbot' lets anyone try Meta's large-scale language model 'Llama 2' from a browser for free - GIGAZINE



This time, I will use a tool called 'LLM' to access Replicate from the command line. Installing LLM requires either 'pip', the package manager bundled with Python, or 'Homebrew', a package manager commonly used on macOS. I used pip this time, so first go to the download page of the official Python site and click the 'Download Python 3.XX.X' button.




Download the installer and double click to run it.



Check 'Add python.exe to PATH' and click 'Install Now'.



The installation starts, so wait a while. When it finishes, click 'Close' to exit the installer.



Open the Start menu, search for 'cmd' and click Command Prompt.



Install the 'LLM' tool with the command below.
[code]pip install llm[/code]



Next, enter the command below to install the plugin that lets LLM access Replicate.
[code]llm install llm-replicate[/code]
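Before moving on, it can be worth confirming that both installs took effect. The following is a minimal Python sketch, not part of the original procedure: it assumes that `llm plugins` prints a JSON list whose entries each have a "name" field, and the helper names `plugin_names` and `check_setup` are made up for this illustration.

```python
import json
import shutil
import subprocess


def plugin_names(plugins_json: str) -> list[str]:
    """Extract plugin names from the JSON text printed by `llm plugins`.

    Assumption: the output is a JSON list of objects with a "name" key.
    """
    return [entry["name"] for entry in json.loads(plugins_json)]


def check_setup(plugin: str = "llm-replicate") -> bool:
    """Return True if `llm` is on PATH and the given plugin is listed."""
    if shutil.which("llm") is None:
        print("llm was not found on PATH; was 'pip install llm' run?")
        return False
    result = subprocess.run(
        ["llm", "plugins"], capture_output=True, text=True, check=True
    )
    installed = plugin_names(result.stdout)
    print("Installed plugins:", installed)
    return plugin in installed


if __name__ == "__main__":
    print("Setup OK" if check_setup() else "Setup incomplete")
```

If the script reports that `llm` is not on PATH, the 'Add python.exe to PATH' checkbox in the Python installer is the first thing to recheck.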



Next, go to Replicate's API Token page to get an API token for accessing Replicate. You will be asked to authenticate with GitHub, so click 'Sign in with GitHub'.



Check the permissions and click 'Authorize replicate'.



Copy the displayed API token.



Enter the command 'llm keys set replicate' and you will be prompted for the key. Paste the API token you copied earlier and press Enter.



Register the 13B model of Llama 2 under the alias 'llama2' with the command below. Once that is done, preparations are complete.
[code]llm replicate add a16z-infra/llama13b-v2-chat --chat --alias llama2[/code]



To ask llama2 a question, use the following format. Command Prompt does not treat single quotes as quoting characters, so wrap the question in double quotes.
[code]llm -m llama2 "insert question here"[/code]
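The same question can also be sent from a script. The sketch below simply wraps the command-line call shown above with Python's standard `subprocess` module; the function names `build_command` and `ask` are hypothetical, and it assumes the 'llama2' alias has already been registered and the API key has been set.

```python
import subprocess


def build_command(question: str, model: str = "llama2") -> list[str]:
    """Build the argument list for `llm -m <model> <question>`."""
    return ["llm", "-m", model, question]


def ask(question: str, model: str = "llama2") -> str:
    """Run the llm CLI and return whatever it prints to standard output."""
    result = subprocess.run(
        build_command(question, model),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if llm exits with an error
    )
    return result.stdout.strip()
```

Calling `print(ask("Ten great names for a pet pelican"))` would then print the reply. Because the question is passed as a single list element rather than through a shell, spaces and quote characters in it need no escaping.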



For example, imagining the situation 'I want to keep a pelican as a pet but I can't think of a good name', I entered 'Ten great names for a pet pelican'. It came up with 10 charming names such as 'Peli' and 'Pelty'.



'I can also speak Japanese,' says Llama 2.



However, even when I asked a question in Japanese, the answer often switched to English partway through. Perhaps its mother tongue slips out when it gets excited...
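One rough way to put a number on this behavior is to measure what share of a reply's letters are Japanese. The following Python sketch is only an illustration and was not used for the article: the function names are made up, and it counts kanji as Japanese even though the same characters also appear in Chinese text.

```python
def is_japanese_char(ch: str) -> bool:
    """True for hiragana, katakana, or a CJK unified ideograph (kanji)."""
    code = ord(ch)
    return (
        0x3040 <= code <= 0x309F      # hiragana
        or 0x30A0 <= code <= 0x30FF   # katakana
        or 0x4E00 <= code <= 0x9FFF   # CJK unified ideographs (shared with Chinese)
    )


def japanese_ratio(text: str) -> float:
    """Share of alphabetic characters in `text` that are Japanese.

    Digits, punctuation, and whitespace are ignored. Returns 0.0 when the
    text contains no alphabetic characters at all.
    """
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    return sum(is_japanese_char(ch) for ch in letters) / len(letters)
```

A reply that starts in Japanese and drifts into English would score somewhere between 0 and 1, with lower values meaning more of the answer ended up in English.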



Replicate can be used for free for a certain period, after which it costs $0.0023 (about 0.32 yen) per second of inference time. You do not need to register a credit card in advance; you can register one once inference is no longer available. At the time of writing, how much can be used for free was not stated.
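The per-second price makes a rough estimate straightforward. The Python sketch below uses the $0.0023 rate quoted above and derives the yen rate from the article's own "about 0.32 yen" conversion; the function name and the sample durations are made up, and the free allowance is ignored because its size is not stated.

```python
from typing import Iterable

USD_PER_SECOND = 0.0023       # rate quoted in the article
YEN_PER_USD = 0.32 / 0.0023   # implied by the article's "about 0.32 yen" figure


def estimate_cost(durations_sec: Iterable[float]) -> tuple[float, float]:
    """Return (cost in USD, cost in yen) for a series of inference times.

    Assumption: every second is billed at the flat rate, with no free tier.
    """
    total_seconds = sum(durations_sec)
    usd = total_seconds * USD_PER_SECOND
    return usd, usd * YEN_PER_USD


if __name__ == "__main__":
    # Made-up inference times in seconds, purely for illustration.
    usd, yen = estimate_cost([4.2, 7.9, 12.5])
    print(f"${usd:.4f} (about {yen:.2f} yen)")
```

At this rate, ten seconds of inference comes to about $0.023, or roughly 3.2 yen.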



You can check how many seconds each inference took by opening the Replicate dashboard.

in Review, Software, Web Service, Posted by log1d_ts