
It's All About Application! DataCanvas Releases Its Large Model Series Achievements

2023.11.21


On November 21, the DataCanvas large model series achievements conference (hereinafter referred to as the "conference"), themed "Building the Foundation, Empowering the Smart Future", was held in Beijing. The release came less than five months after the launch of the DataCanvas Alaya large model matrix on June 30 this year, which demonstrates the strength of DataCanvas in applying large model technology and in innovation.

At the conference, DataCanvas released a series of application-oriented large model innovations: the open-source, low-level LLMOps large model toolchain built around the DataCanvas Alaya large model, the open-source Alaya-7B large model series, and the TableAgent data analysis agent. Their industry-leading model performance and zero-threshold, high-value data analysis applications once again raised the bar for the large model application industry.

During the conference, Guangdong Private Investment Co., Ltd. (GDMI), Tibet Saifuheyin Investment Co., Ltd. (Tibet Saifu), Guangzhou Saifuheyin Asset Management Co., Ltd. (Saifu Asset Management), and DataCanvas Ltd. officially signed the Strategic Cooperation on Jointly Promoting the Popularization and Updating of Artificial Intelligence in the Industrial Field and the Strategic Cooperation on Ecological Expansion of Intelligent Computing Center. In addition, DataCanvas and Hanbo Semiconductor signed the Ecological Strategic Cooperation Agreement for Co-building Intelligent Computing Clusters. These agreements are intended to promote the integration and innovation of AI software and hardware, advance the large-scale application of AI technology in the industrial field, and build a nationally leading and internationally influential demonstration zone for innovative AI applications.

Alaya-7B general and chat large models plus the LLMOps toolchain: the large model "full suite" is officially open-sourced

At the conference, DataCanvas officially open-sourced a series of new achievements from its independently developed DataCanvas Alaya large model matrix. These include the Alaya-7B Foundation Model and the Alaya-7B Chat Model in the "Alaya-7B large model series", as well as the LMS model serving tool and the LMPM prompt manager in the "LLMOps large model toolchain".

Figure: The DataCanvas Alaya-7B large model series is officially open-sourced

Alaya-7B: a lower technical threshold and lower compute consumption

Dr. Yu Jiangang, Vice President of DataCanvas, explained that the Alaya-7B large model series is a member of the DataCanvas Alaya large model matrix. Based on the Alaya general model, it is pre-trained from scratch on a trillion-token dataset that DataCanvas collected itself and carefully screened and processed, covering Chinese and English articles, news, encyclopedias, and other online sources. Alaya-7B has achieved industry-leading results on the authoritative large model benchmarks in which it has taken part, including C-Eval, CMMLU, AGIEval, MMLU, and BBH.

The Alaya-7B Chat Model is the chat version of the Alaya-7B Foundation Model. It is fine-tuned on carefully selected datasets that have been detoxified to remove content involving drugs, pornography, and harmful bias, yielding a large dialogue model aligned with human values. The Alaya-7B Chat Model supports multi-turn dialogue, self-awareness, and bias rejection, and can complete a range of language tasks such as knowledge question answering, coding, information extraction, reading comprehension, and creative writing.

Dr. Yu said that, while maintaining model performance, the Alaya-7B large model series has lower hardware requirements for installation and use, is technically easier to apply, and consumes fewer computing resources in training. This helps accelerate the practical application of large models across industrial scenarios.

Figure: The DataCanvas LLMOps large model toolchain is officially open-sourced

A full-lifecycle large model toolchain

The LLMOps toolchain is built for training and using large models. It covers the entire large model lifecycle: training, fine-tuning, compression, deployment, inference, and monitoring. It gives data scientists and application developers a complete set of tools to process data easily and to use that data to develop, train, and deploy models of any size.

LMS (Large Model Serving) is aimed mainly at engineers and technical developers. It helps them deliver and operate large models, improves delivery speed and quality, reduces operation and maintenance costs, and meets the needs of large-scale model production and service operation.

LMPM (Large Model Prompt Manager) is a tool for designing and constructing large model prompts. It guides users toward better prompts that produce more accurate and reliable output that matches expectations. The tool offers both a development toolkit for technical staff and a human-machine interaction mode for non-technical staff, so it meets the needs of different groups of large model users.

DataCanvas continues to integrate and innovate across toolchains, large models, and industrial applications. Previously, DingoDB, one component of the toolchain, was combined with the DataCanvas Alaya large model to create an "enterprise knowledge steward" solution, which enables enterprises to build a highly automated and intelligent enterprise knowledge base and accelerates the deployment and application of multimodal models.

From the DingoDB multimodal vector database, LMS, and LMPM to the Alaya-7B general and chat large model series, DataCanvas now provides users with a one-stop, zero-threshold, full-chain open-source tool portfolio that spans data management through large model application.

Open-source addresses

Alaya-7B: https://github.com/DataCanvasIO/Alaya

DingoDB: https://github.com/DingoDB

LMS: https://github.com/DataCanvasIO/LMS

LMPM: https://github.com/DataCanvasIO/LMPM

Targeting the most commercially valuable application scenario: TableAgent enables everyone to be a data analyst

The conference also marked the official release of the public beta of the TableAgent data analysis agent, an agent innovation built on the DataCanvas Alaya large model and the LLMOps large model toolchain.

Figure: The TableAgent data analysis agent is officially released

Yang Jian, Chief Architect of DataCanvas, said that TableAgent evolved from Alaya's meta-knowledge. It is a zero-to-one breakthrough in interactive structured data analysis and a new approach to enterprise data analysis, turning the idea that "everyone is a data analyst" from a dream into reality.

TableAgent is an enterprise-grade data analysis agent that supports private deployment. It has strong intent understanding, analytical modeling, and insight capabilities. Once it fully understands the user's intent, TableAgent can independently apply advanced modeling techniques such as statistics, machine learning, and causal inference to mine value from data, providing analytical views and insights that guide action. This heuristic, guided analysis capability keeps extracting information and value from the data in both depth and breadth, helping users complete high-quality analysis.

Figure: TableAgent features and advantages

In addition, thanks to its own large model and the self-developed T+underlying system, TableAgent can be applied across industries and professions, and can be professionally fine-tuned for personalized data analysis in any specific field.

Yang Jian pointed out that generative AI currently takes many forms in the market. The TableAgent team found that data analysis is a further step in integrating large models with specific businesses, is the core field most able to generate direct business value for users, and will be the field where enterprises most need to accumulate capability. Consistent with the "all for application" goal that guides DataCanvas's AI foundation software development, TableAgent focuses on data analysis. As a Data+AI product, it is expected to create substantial business value for enterprises and open up vast blue-ocean opportunities in an AI era dominated by large models.

TableAgent public beta address: https://tableagent.DataCanvas.com

The integrated "algorithm + computing power" service advances at full speed

At the conference, DataCanvas officially signed three strategic cooperation agreements with upstream and downstream ecosystem partners. The agreements cover independent innovation in AI software and hardware, large-scale application of AI technology in the industrial field, and the construction of domestic intelligent computing power, in support of the company's integrated "algorithm + computing power" service ecosystem strategy.

Building on its industry-leading AI technology and productization strength, DataCanvas signed the Ecological Strategic Cooperation Agreement for Co-building Intelligent Computing Clusters with Hanbo Semiconductor. The two sides will draw on their respective strengths and make large-scale use of high-quality domestic software and hardware products to create a new, fully functional intelligent computing cluster that spans hardware and software and provides full-stack capabilities from underlying computing power to application enablement.

Figure: DataCanvas and Tibet Saifu sign the Strategic Cooperation on Jointly Promoting the Popularization and Updating of Artificial Intelligence in the Industrial Field

On the strength of DataCanvas's leading edge and rich industry experience in AIFS, including its large model series tools, Tibet Saifu officially signed the Strategic Cooperation on Jointly Promoting the Popularization and Updating of Artificial Intelligence in the Industrial Field with DataCanvas. The two parties aim to jointly promote the industrial application of DataCanvas's AI system solutions in advanced manufacturing, hydropower, thermal power, wind power, energy storage, mining, petrochemicals, and other industrial fields, and to advance the popularization and updating of AI technology in domestic industry.

Figure: DataCanvas signs the Strategic Cooperation on Ecological Expansion of Intelligent Computing Center with GDMI and Saifu Asset Management

GDMI, Saifu Asset Management, and DataCanvas also jointly signed the Strategic Cooperation on Ecological Expansion of Intelligent Computing Center, adding strong impetus to the company's integrated "algorithm + computing power" service strategy. Under the agreement, the parties will work together to build a new type of intelligent computing center in China that is leading in scale and independently controllable, and will advance plans for industrial integration, innovation, and a cooperative ecosystem. They will draw on the advanced software and hardware technologies and partner resources that each holds in the field of artificial intelligence to empower the real economy, promote innovative applications and industrialization of AI in China, including general and vertical large models, and create a benchmark demonstration project for intelligent computing centers.

Figure: Dr. Fang Lei, Chairman of DataCanvas, delivers a speech

Dr. Fang Lei, Chairman of DataCanvas, delivered a keynote speech at the conference on the development of domestic AI computing power in the new era. He said that in this first year of the large model, the company has seized the moment, stayed at the forefront, and co-evolved with large models. The development of large models not only provides fertile ground for deeper and wider application of AI technology across industries, but also brings an unprecedented technological boost and market opportunities for the innovative integration of AI foundation software and hardware. Within this, the localization of AI computing power is a promising new field in which China's AI software and hardware can empower each other. Working hand in hand with high-quality ecosystem partners will accelerate a win-win between the company and the times.

This large model series achievements conference further established DataCanvas as a leader in AI foundation software. Going forward, the company will anchor itself in AI applications, deepen ecosystem cooperation across the AI industry chain by building on the advantages of AIFS and DataPilot, and continue to inject independent and innovative AI foundation software capability into AI Foundation Service.