IBM’s enterprise-grade AI model Granite 3.0 supports AI Agent application development

IBM is actively developing AI Agent technology to perform complex tasks in a dynamic business environment

In late October, IBM released Granite 3.0, the third generation of its flagship model series; it has surpassed or kept pace with competitor models of similar size in many academic and industry evaluation tests, demonstrating excellent performance, customization flexibility, transparency and security. IBM is developing a series of new technologies to promote the development of enterprise-grade AI: from models, AI assistants, to the tools needed to optimize and deploy AI for enterprise-specific data and applications. IBM is actively developing AI Agent technology to enable it to self-guide, review and correct, and perform complex tasks in dynamically changing business environments. IBM continues to develop the capabilities of its AI Assistants series. For example, WatsonX Orchestrate helps enterprises use low-code tools and automation to build AI assistants that are pre-trained for specific tasks or areas, such as answering daily questions from customers or employees, supporting modern engineering of mainframes and traditional IT applications, guiding young students to explore possible career paths, or providing online mortgage consultations to homebuyers. The “IBM AskHR” AI assistant used by 300,000 IBM employees worldwide was developed using WatsonX Orchestrate. IBM also announced the next generation of Watsonx Code Assistant (WCA) in late October this year. The new version is supported by the Granite code model and can provide general code development assistance for languages ​​such as C, C++, Go, Java and Python, and provide advanced application modernization capabilities for enterprise Java applications. Granite’s code development assistance function is now also available through IBM Granite.Code (an extension of Visual Studio Code). IBM plans to continue to release new tools to help developers use Watsonx.ai to design, customize and deploy AI more efficiently, including AI agent frameworks, integration capabilities with existing environments, and enhanced support for common application scenarios (such as RAG and Agents). IBM is working on developing AI Agents technology with higher autonomy, complex reasoning capabilities, and multi-step problem solving and tool invocation. The first version of the Granite 3.0 8B model supports major AI Agent features, such as high-level reasoning and highly structured chat templates and instruction forms required to build workflows for tool invocation. IBM also plans to add new AI Agent chat features to IBM WatsonX Orchestrate in the first quarter of 2025, allowing AI Agents to “coordinate” AI assistants, skills, and automation, allowing enterprises to effectively improve the overall productivity of the organization. IBM will continue to enhance the capabilities of AI Agents in its product portfolio, including pre-trained Agents for specific domains and application scenarios. The IBM Granite 3.0 series of models released by IBM recently include: General/language models: Granite 3.0 8B Instruct, Granite 3.0 2B Instruct, Granite 3.0 8B Base, Granite 3.0 2B Base Guardian and security models: Granite Guardian 3.0 8B, Granite Guardian 3.0 2B Expert hybrid models: Granite 3.0 3B-A800M Instruct, Granite 3.0 1B-A400M Instruct, Granite 3.0 3B-A800M Base, Granite 3.0 1B-A400M Base Main features of the IBM Granite 3.0 series of models: Suitability: Many large language models (LLMs) are based on publicly available training and do not contain data with intellectual property rights or internal enterprise data. Granite 3.0 8B and 2B are designed as workhorse models for enterprise-grade AI, providing powerful performance for enterprise tasks such as retrieval augmentation generation (RAG), classification, summarization, entity extraction, tool usage, etc. These compact, versatile models can be fine-tuned based on enterprise data and seamlessly integrated with business scenarios or workflows. Performance: In HuggingFace’s OpenLLM ranking test, the overall performance of the Granite 3.0 8B Instruct model averaged the best performance of similar-sized open source models from Meta and Mistral. In IBM’s AttaQ security test, the above models outperformed Meta and Mistral’s models in all tested security dimensions. In this release, there is also the Mixture of Experts (MOE) Granite 3.0 1B-A400M and Granite 3.0 3B-A800M, which is a professional scheduling technology that can dynamically select the best expert model for reasoning based on the input content, improve efficiency and reduce computing resource requirements. It is particularly suitable for low-latency applications with high requirements for response speed, and takes into account the perfect balance between performance and inference cost. IBM also released an updated version of the pre-trained Granite time series model. These new models are based on three times more data training, have higher modeling flexibility, support external variables and rolling predictions. In the three major time series model evaluations, Granite’s performance exceeded models ten times larger than those of Google, Alibaba, etc. Cost: Enable small Granite models on specific tasks, with your own data, and use IBM and RedHat’s revolutionary alignment technology InstructLab launched in May this year (2024) to help companies train their own models in an efficient and low-cost way. (According to cost analysis results of several early proof-of-concept projects, the cost reduction is about 3 to 23 times) Transparency: The Granite 3.0 technical report and responsible use guide both describe in detail the data sets used to train these models, the data filtering, cleaning and processing steps used, and list their performance results in major academic and industry benchmarks. Legal protection: IBM provides intellectual property rights compensation for all Granite models on the watsonx.ai platform to strengthen the confidence of corporate customers in adopting this model. Security: IBM has launched a new Granite Guardian model series. Application developers can build “safety fences” by checking user prompts and LLM responses to detect various risks in advance. Granite Guardian 3.0 8B and 2B models provide the most complete risk and danger detection capabilities on the market; they can also be used with any other open or dedicated AI models to strengthen AI safety protection mechanisms. Responsibility: In addition to AI hazard indicators such as bias, hatred, swearing, profanity, violence, and attempts to break restrictions, the Granite Guardian model also provides unique RAG-specific checks, such as whether it is based on facts, relevance to context, and relevance to answers. In a comprehensive evaluation of 19 security and RAG standards, the Granite Guardian 3.0 8B model has an overall accuracy of hazard detection that is on average better than the three existing versions of Meta’s Llama Guard model; its overall performance in hallucination detection is also comparable to the models WeCheck and MiniCheck that are specifically used for hallucination detection. Inclusiveness: The Granite 3.0 model is trained using more than 12 trillion tokens of data; the data comes from 12 different natural languages ​​and 116 different programming languages, using a new two-stage training method, and citing thousands of experimental results to optimize data quality, data selection, and training parameters. It is expected that by the end of this year (2024), Granite 3.0 8B and 2B models will support expansion to 128K context length and multimodal models, which can not only process long texts, but also analyze complex documents containing text and images. Openness: The entire Granite 3.0 model suite and updated time series models can be downloaded on HuggingFace under the permissive Apache 2.0 license. The new Granite 3.0 8B and 2B language model instruction variables and Granite Guardian 3.0 8B and 2B models are commercially available on the IBM WatsonX platform. Some Granite 3.0 models will also be provided as NVIDIA NIM microservices and through the integration of Google Cloud’s Vertex AI Model Garden and HuggingFace. To provide developers with choice and ease of use, and to support local deployment and edge applications, select Granite 3.0 models are also available on Ollama and Replicate. The new generation of Granite models expands IBM’s strong catalog of open source LLMs: IBM is working with partners such as AWS, Docker, Domo, Qualcomm Technologies Inc. (through Qualcomm® AI Hub), Salesforce, SAP, and others to integrate multiple Granite models into their products or platforms. Empowerment: IBM also announced that Granite 3.0 is the default AI model on the IBM Consulting Advantage AI empowerment service platform. 160,000 IBM consultants around the world can easily and conveniently apply the Granite model in various client application scenarios, such as customer service or IT modernization, to provide business value to clients in a more agile, efficient and economical manner.