Mistral released the new generation of its flagship open-source artificial intelligence (AI) model, Mistral Large 2, on Wednesday. The company claims the AI model offers significantly improved capabilities in code generation, mathematics, and reasoning. It also adds support for several new languages as well as advanced function calling capabilities. Mistral also says that, despite being about one-third the size of the recently released Meta Llama 3.1 405B AI model, its flagship large language model (LLM) offers similar performance. Notably, Mistral Large 2 is only available for research and non-commercial use.
The company announced the AI model in a newsroom post. Mistral Large 2 comes with a 128,000-token context window, which is similar to that of Meta’s latest AI offering. Additionally, the flagship Mistral AI model supports several new languages including Arabic, Chinese, French, German, Hindi, Italian, Japanese, Korean, Portuguese, Russian, and Spanish. It can also generate code in more than 80 programming languages.
Mistral’s new AI model has 123 billion parameters and can run on a single node. The company said there were three main focus areas in improving the Large 2 model. The first was code generation, for which the LLM was trained on a large volume of coding data. Second, to improve its reasoning capability and minimise instances of hallucination, the AI firm fine-tuned the model to be more cautious in its responses. Finally, the AI model was trained to “acknowledge when it cannot find solutions or does not have sufficient information to provide a confident answer.”
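To put the parameter count in perspective, the short Python sketch below works out a back-of-the-envelope memory footprint for the model weights and the size ratio against Llama 3.1 405B. The 2-bytes-per-parameter figure (16-bit weights) is an assumption made for illustration and is not a number from Mistral’s announcement; the estimate also ignores activations and the key-value cache.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Approximate memory needed to hold the weights alone, in gigabytes."""
    return num_params * bytes_per_param / 1e9


MISTRAL_LARGE_2_PARAMS = 123e9  # 123 billion, as stated in the article
LLAMA_3_1_PARAMS = 405e9        # 405 billion, from the model's name

# Assumption: weights stored in a 16-bit format (2 bytes per parameter).
weights_gb = weight_memory_gb(MISTRAL_LARGE_2_PARAMS, 2)

# How large Mistral Large 2 is relative to Llama 3.1 405B.
size_ratio = MISTRAL_LARGE_2_PARAMS / LLAMA_3_1_PARAMS

print(f"Weights at 16-bit precision: about {weights_gb:.0f} GB")
print(f"Size relative to Llama 3.1 405B: {size_ratio:.0%}")
```

Under that assumption the weights alone come to roughly 246GB, and the model is about 30 percent the size of Meta’s, which is in line with the “one-third” figure.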
The company claims that, despite being about one-third the size of Llama 3.1 405B, its LLM outperforms Meta’s model in some areas. Based on its internal benchmark testing, Mistral said its AI model fared better in code generation and math performance. It also claimed to outperform GPT-4o in Java code generation.
Further, the company claims that Mistral Large 2 has enhanced function calling and retrieval skills that allow it to power complex business applications. Function calling is the ability of an AI model to interact with external tools or functions. This allows it to procure data from various sources and provide more accurate, informative, and efficient responses.
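The general idea of function calling can be illustrated with a minimal Python sketch. It does not use Mistral’s API: the model’s reply is simulated as a JSON string, and the tool name `get_order_status`, its data, and the `dispatch_tool_call` helper are all hypothetical, chosen only for illustration.

```python
import json


def get_order_status(order_id: str) -> dict:
    """Hypothetical external tool; a real one would query a database or API."""
    fake_db = {"A100": "shipped", "A101": "processing"}
    return {"order_id": order_id, "status": fake_db.get(order_id, "unknown")}


# Registry mapping tool names the model may request to Python callables.
TOOLS = {"get_order_status": get_order_status}


def dispatch_tool_call(raw_call: str) -> str:
    """Parse a model-issued tool call, run the tool, return the result as JSON."""
    call = json.loads(raw_call)
    func = TOOLS[call["name"]]
    result = func(**call["arguments"])
    return json.dumps(result)


# Simulated model output: instead of answering in prose, the model asks the
# application to run a tool with specific arguments.
model_output = '{"name": "get_order_status", "arguments": {"order_id": "A100"}}'

tool_result = dispatch_tool_call(model_output)
print(tool_result)
```

In a real application, the tool result would be sent back to the model, which would then use it to compose its final answer to the user.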
The company has partnered with Google Cloud Platform to bring the Large 2 AI model to Vertex AI via a managed application programming interface (API). It is also available on the cloud via Azure AI Studio, Amazon Bedrock, and IBM Watsonx. Since it is an open-source AI model, interested individuals can also access the LLM via Mistral’s website under the name mistral-large-2407.
To download the instruct model, users can check its Hugging Face listing. Notably, it is available under the Mistral Research Licence, which only allows usage and modification for research and non-commercial purposes.