IBM’s new AI model family focused on enterprise applications: Granite 3.0

In May this year, we reported that IBM had announced its open-source Granite generative AI models, which focus on code generation, along with its InstructLab initiative. In a new post, IBM has now introduced the Granite 3.0 8B and Granite 3.0 2B models, released under the Apache 2.0 license.

The company also introduced Mixture of Experts (MoE) models: Granite 3.0 3B A800M Instruct, Granite 3.0 1B A400M Instruct, Granite 3.0 3B A800M Base and Granite 3.0 1B A400M Base. In addition, IBM has released a new group of models consisting of Granite Guardian 3.0 8B and Granite Guardian 3.0 2B. This group stands out for its optimized guardrail and safety capabilities.

According to Rob Thomas, senior vice president and chief commercial officer at IBM, the company's business built around generative AI now exceeds $2 billion across technology and consulting, as noted on its latest earnings call. Thomas said that in his 25 years at IBM, he has never seen a business scale at this pace.

The Granite 3.0 model family targets enterprise applications in areas such as customer service, IT automation, business process outsourcing (BPO), application development and cybersecurity.

Training data

The new Granite 3.0 models were trained by IBM's centralized data model factory team, which is responsible for sourcing and curating the data IBM uses for training. The models were trained on data from 12 natural languages and 116 programming languages using a new two-stage training method. According to Dario Gil, IBM senior vice president and director of research, the training process used 12 trillion tokens of data.

Benchmarks

According to the company, the Granite 3.0 8B Instruct model leads on average against similarly sized open-source models from Mistral and Meta on core enterprise tasks: retrieval-augmented generation (RAG), tool use and cybersecurity.

IBM notes that on the standard academic benchmarks defined by Hugging Face’s Open LLM Leaderboard, the overall performance of the Granite 3.0 8B Instruct model leads on average against similarly sized state-of-the-art open-source models from Meta and Mistral. However, Llama 3.2 3B outperforms Granite 3.0 2B, especially on the MMLU, MMLU-Pro and AGI-Eval benchmarks. Likewise, Llama 3.1 8B surpasses Granite 3.0 8B on MMLU (Massive Multitask Language Understanding).

In addition, the company states that on the AttaQ safety benchmark, the Granite 3.0 8B Instruct model leads in all measured safety dimensions compared to the Meta and Mistral models. More detailed comparisons are available in the technical paper for the new IBM language models, so it is worth reading the technical documentation before trying the models.

The new models will be available on IBM's watsonx service as well as on Amazon Bedrock, Amazon SageMaker and Hugging Face. By the end of the year, the Granite 3.0 8B and 2B language models are expected to gain support for an extended 128K context window and multimodal document understanding capabilities.

Source link: https://webrazzi.com/2024/10/21/ibm-in-kurumsal-uygulamalara-odaklanan-yeni-yapay-zeka-model-ailesi-granite-3-0/