Shisa.AI presents Japan's top open-source bilingual (JA/EN) LLM, Shisa V2 405B. Based on Llama 3.1 405B, it rivals GPT-4o and DeepSeek-V3 on Japanese tasks. The release includes the model, the training dataset, and a chat demo.
Fine-tuning models to achieve state-of-the-art performance in individual languages is increasingly important for truly global AI. Shisa.AI's latest work, the Shisa V2 405B model, is a powerful example of this for Japanese.
Shisa.AI built this new open-source model on Llama 3.1 405B, and it delivers impressive results. They report that it not only surpasses earlier GPT-4 versions on their Japanese/English evaluations but also competes head-to-head with the latest models, such as GPT-4o and DeepSeek-V3, on Japanese benchmarks. The key to this success, they emphasize, was high-quality data.
Beyond the 405B model itself, Shisa.AI has also open-sourced its core Shisa V2 JA/EN synthetic dataset, which they believe can boost the Japanese capabilities of almost any base model. You can download the model and dataset now, and chat with an FP8-quantized version of the model in the live demo.