Seven Large Language Models for Generative AI Specially Trained on Cerebras Hardware Released
Cerebras Systems, a chipmaker that specializes in artificial intelligence, announced this week that it has released seven large language models for generative AI and made them available to the wider research community. According to the announcement, these GPT-like large language models were trained on CS-2 systems in the Cerebras Andromeda AI supercluster. What makes this notable is that those systems are powered by the company's WSE-2 chip, which is specially designed to run AI software.
This means that these seven LLMs are among the first to be trained without relying on graphics processing unit (GPU)-based systems. The team stated that it will share both the models and the training methods used under a standard Apache 2.0 license. Until now, there have been few practical alternatives to Nvidia Corp's GPUs, which has helped fuel a sort of arms race among AI hardware makers seeking to create chips that are more powerful and better suited to AI-specific needs.
The WSE-2 chip is the heart of the Cerebras Andromeda supercomputer, which boasts more than 13.5 million processor cores and is optimized to run AI applications. Backed by more than $720 million in venture funding, the Sunnyvale, California-based startup offers a promising start for those wishing to be released from the constraints of GPU processing.
This is important because traditional LLM training on GPUs requires a complex amalgam of pipeline, model, and data parallelism techniques, which takes considerable resources and labor hours. With Cerebras' weight streaming architecture, the same projects can be done with a simpler, data-parallel-only approach that requires no code or model modification to scale to very large models.
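To make the term concrete, here is a minimal, purely conceptual sketch of data parallelism in plain Python. It is not Cerebras code and says nothing about how weight streaming is implemented; the function names, the toy linear model, and the learning rate are all assumptions chosen for illustration. Each simulated worker holds the same weights, computes a gradient on its own slice of the batch, and the gradients are averaged before a single shared update.

```python
# Toy illustration of data parallelism (hypothetical names, not Cerebras code).
# Every "worker" uses identical weights but sees a different shard of the batch.

def gradient(w, b, shard):
    """Mean-squared-error gradient of y = w*x + b over one shard of (x, y) pairs."""
    n = len(shard)
    gw = sum(2 * (w * x + b - y) * x for x, y in shard) / n
    gb = sum(2 * (w * x + b - y) for x, y in shard) / n
    return gw, gb

def data_parallel_step(w, b, batch, num_workers, lr=0.01):
    """One gradient step with the batch split evenly across num_workers replicas."""
    size = len(batch) // num_workers
    shards = [batch[i * size:(i + 1) * size] for i in range(num_workers)]
    grads = [gradient(w, b, s) for s in shards]   # same weights on every worker
    gw = sum(g[0] for g in grads) / num_workers   # average the per-worker gradients
    gb = sum(g[1] for g in grads) / num_workers
    return w - lr * gw, b - lr * gb

# Synthetic data drawn from y = 3x + 1, split across four simulated workers.
batch = [(float(x), 3.0 * x + 1.0) for x in range(8)]
w, b = 0.0, 0.0
for _ in range(2000):
    w, b = data_parallel_step(w, b, batch, num_workers=4)
print(round(w, 2), round(b, 2))
```

Because the shards are equal in size, averaging the per-worker gradients gives the same update as processing the whole batch on one device. That is why this style of parallelism needs no change to the model itself, unlike pipeline or model parallelism, which split the model across devices.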
Cerebras' announcement comes with the open-sourcing of a series of seven GPT models with 111 million, 256 million, 590 million, 1.3 billion, 2.7 billion, 6.7 billion, and 13 billion parameters. These are available on both GitHub and Hugging Face. Before the Cerebras CS-2 systems in Andromeda were available, training these models would have taken months of work. Now, according to the company, training time has been reduced to a couple of weeks.
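For a rough sense of scale, the short sketch below estimates how much memory the weights of each of the seven models would occupy. The parameter counts come from the announcement; the 2 bytes per parameter (16-bit weights) is an assumption made here for illustration, not a detail from Cerebras, and the estimate ignores activations, optimizer state, and any other overhead.

```python
# Parameter counts as listed in the announcement.
PARAMS = {
    "111M": 111_000_000,
    "256M": 256_000_000,
    "590M": 590_000_000,
    "1.3B": 1_300_000_000,
    "2.7B": 2_700_000_000,
    "6.7B": 6_700_000_000,
    "13B": 13_000_000_000,
}

BYTES_PER_PARAM = 2  # assumption: weights stored in a 16-bit format

def weight_gib(num_params, bytes_per_param=BYTES_PER_PARAM):
    """Approximate size of the weights alone, in GiB (2**30 bytes)."""
    return num_params * bytes_per_param / 2**30

for name, count in PARAMS.items():
    print(f"{name:>5}: {weight_gib(count):6.2f} GiB")
```

Under this assumption the series spans about two orders of magnitude in weight size, which is a reminder that the smallest models fit comfortably on modest hardware while the largest need considerably more memory.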
With these seven LLMs open-sourced and requiring less training time, lower training costs, and less energy than LLMs currently on the market, Cerebras Systems seems to want to open the door for those wishing to build powerful generative AI applications with minimal effort.
Originally posted on OpenDataScience.com