'Open Japanese LLM Leaderboard' released to evaluate the performance of large-scale language models used in Japanese AI
The 'Open Japanese LLM Leaderboard', which evaluates and analyzes the performance of Japanese large language models (LLMs) on more than 16 types of natural language processing (NLP) tasks, has been released. The leaderboard was built by the National Institute of Informatics and other cross-organizational research and development project '
Open Japanese LLM Leaderboard - a Hugging Face Space by llm-jp
https://huggingface.co/spaces/llm-jp/open-japanese-llm-leaderboard
Introducing the Open Leaderboard for Japanese LLMs!
https://huggingface.co/blog/leaderboard-japanese
Open Japanese LLM Leaderboard Released - LLM Study Group
https://llm-jp.github.io/llm/2024/11/20/open-japanese-llm-leaderboard.html
While LLMs are widely used in English, it has been difficult to know how well they perform in other languages. The 'Open Japanese LLM Leaderboard' uses 'llm-jp-eval', an automatic evaluation tool for Japanese LLMs, to measure model performance.
llm-jp-eval: An automatic evaluation tool for large-scale Japanese language models
(PDF file) https://www.anlp.jp/proceedings/annual_meeting/2024/pdf_dir/A8-2.pdf
The supported evaluation datasets are as follows:
・Natural Language Inference (NLI): Jamp, JaNLI, JNLI, JSeM, JSICK
・Question Answering (QA): JEMHopQA, NIILC
・Reading Comprehension (RC): JSQuAD
・Multiple Choice question answering (MC): JCommonsenseQA
・Entity Linking (EL): chABSA
・Fundamental Analysis (FA): Wikipedia Annotated Corpus
・Mathematical Reasoning (MR): MAWPS
・Semantic Textual Similarity (STS): JSTS
・Machine Translation (MT): ALT, WikiCorpus
・Human Examination (HE): MMLU, JMMLU
・Code generation (CG): MBPP
・Summarization (SUM): XL-Sum
Japanese has a very complex writing system that mixes four different scripts: hiragana, katakana, kanji, and romaji. Furthermore, it does not put spaces between words, which makes tokenization difficult.
Japanese LLMs that account for these characteristics of Japanese natural language processing are nevertheless being developed. The problem is that there has been no centralized, open system for comparing them.
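The point about scripts and missing spaces can be seen in a few lines of Python. The sketch below uses only the standard library; the sample sentence and the helper names `script_of` and `script_runs` are made up for illustration. Grouping characters by script is a naive heuristic that shows why whitespace splitting fails. It is not how real Japanese tokenizers work.

```python
import unicodedata
from itertools import groupby

# Hypothetical sample sentence ("I evaluate AI models") mixing all four scripts.
SENTENCE = "私はAIモデルを評価する"


def script_of(ch: str) -> str:
    """Classify one character by script, based on its Unicode character name."""
    name = unicodedata.name(ch, "")
    if name.startswith("HIRAGANA"):
        return "hiragana"
    if name.startswith("KATAKANA"):
        return "katakana"
    if name.startswith("CJK UNIFIED IDEOGRAPH"):
        return "kanji"
    if name.startswith("LATIN"):
        return "romaji"
    return "other"


def script_runs(text: str) -> list[tuple[str, str]]:
    """Split text into maximal runs of same-script characters (naive heuristic)."""
    return [(script, "".join(chars)) for script, chars in groupby(text, key=script_of)]


# Whitespace splitting finds no word boundaries: the whole sentence is one "token".
whitespace_tokens = SENTENCE.split()

# Script runs give a rough segmentation, but run boundaries are not word boundaries.
runs = script_runs(SENTENCE)
```

Here `whitespace_tokens` contains the entire sentence as a single element, while `runs` breaks it wherever the script changes. Neither reliably matches real word boundaries, which is why Japanese text usually needs a dedicated tokenizer.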
To increase transparency in research and encourage an open-source approach to model development, Hugging Face and llm-jp have collaborated to build an open Japanese leaderboard.
It is hoped that this initiative will become a platform for evaluating and improving Japanese LLMs through collaboration between researchers in Japan and abroad.
in Web Service, Posted by logc_nt