Nov 26, 2024 09:37:00

'Open Japanese LLM Leaderboard' released to evaluate the performance of large-scale language models used in Japanese AI

The 'Open Japanese LLM Leaderboard' has been released. It evaluates and analyzes the performance of Japanese large language models (LLMs) on more than 16 NLP (natural language processing) tasks. The leaderboard was built by 'LLM-jp', a cross-organizational research and development project for Japanese LLMs involving the National Institute of Informatics and other organizations.

Open Japanese LLM Leaderboard - a Hugging Face Space by llm-jp
https://huggingface.co/spaces/llm-jp/open-japanese-llm-leaderboard

Introducing the Open Leaderboard for Japanese LLMs!
https://huggingface.co/blog/leaderboard-japanese

Open Japanese LLM Leaderboard Released - LLM Study Group
https://llm-jp.github.io/llm/2024/11/20/open-japanese-llm-leaderboard.html

While LLMs now perform well in English, it has been difficult to know how well they perform in other languages. The 'Open Japanese LLM Leaderboard' uses 'llm-jp-eval', an automatic evaluation tool for Japanese LLMs, to measure model performance.

llm-jp-eval: An automatic evaluation tool for large-scale Japanese language models
(PDF file) https://www.anlp.jp/proceedings/annual_meeting/2024/pdf_dir/A8-2.pdf

The supported evaluation datasets are as follows:

・Natural Language Inference (NLI): Jamp, JaNLI, JNLI, JSeM, JSICK
・Question Answering (QA): JEMHopQA, NIILC
・Reading Comprehension (RC): JSQuAD
・Multiple Choice question answering (MC): JCommonsenseQA
・Entity Linking (EL): chABSA
・Fundamental Analysis (FA): Wikipedia Annotated Corpus
・Mathematical Reasoning (MR): MAWPS
・Semantic Textual Similarity (STS): JSTS
・Machine Translation (MT): ALT, WikiCorpus
・Human Examination (HE): MMLU, JMMLU
・Code generation (CG): MBPP
・Summarization (SUM): XL-Sum

Japanese has a very complex writing system that mixes four different scripts: hiragana, katakana, kanji, and romaji. Furthermore, it does not put spaces between words, which makes tokenization difficult.
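
To make the point about scripts and spacing concrete, here is a minimal Python sketch. It is an illustration only and is not taken from llm-jp-eval or the leaderboard: the helper names `script_of` and `scripts_used` and the sample sentence are hypothetical, and the Unicode ranges are a simplification that covers only the main hiragana, katakana, and common kanji blocks.

```python
def script_of(ch: str) -> str:
    """Roughly classify one character by its Unicode block (simplified)."""
    code = ord(ch)
    if 0x3040 <= code <= 0x309F:
        return "hiragana"
    if 0x30A0 <= code <= 0x30FF:
        return "katakana"
    if 0x4E00 <= code <= 0x9FFF:
        return "kanji"  # CJK Unified Ideographs (main block only)
    if ch.isascii() and ch.isalpha():
        return "romaji"
    return "other"


def scripts_used(text: str) -> set:
    """Return the set of scripts that appear in the text."""
    return {script_of(ch) for ch in text} - {"other"}


# Hypothetical sample sentence: "I ate ramen in Tokyo."
sentence = "私はTokyoでラーメンを食べた"

# Splitting on whitespace finds no word boundaries at all,
# so the whole sentence comes back as a single "token".
print(sentence.split())

# A single short sentence can mix all four scripts.
print(sorted(scripts_used(sentence)))
```

Because whitespace splitting yields no usable boundaries, Japanese NLP pipelines generally rely on a morphological analyzer or a subword tokenizer rather than a simple split.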

Japanese LLMs that take these characteristics of the language into account are nevertheless being developed, but until now there has been no centralized, open system for comparing them.

To increase transparency in research and encourage an open source model development approach, Hugging Face and llm-jp have collaborated to build an open Japanese leaderboard.

It is hoped that this initiative will become a platform for evaluating and improving Japanese LLMs through collaboration between researchers in Japan and abroad.

Nov 26, 2024 09:37:00 in Web Service, Posted by logc_nt