Cerebras Systems, the pioneer in accelerating generative AI, and Petuum, the generative AI company focused on building transparent LLMs, in partnership with MBZUAI, today launched CrystalCoder, a new 7-billion-parameter model designed for English language and coding tasks. While previous models were suitable for either English or coding, CrystalCoder achieves high accuracy on both. Trained on Condor Galaxy 1, the AI supercomputer built by G42 and Cerebras, CrystalCoder-7B has been released under the new reproducible LLM360 methodology, which promotes open-source, transparent, and responsible use. CrystalCoder and the LLM360 release methodology are available now on Hugging Face.
As open-source models gain parity with closed-source LLMs in terms of performance and accuracy, leading open-source developers are increasingly releasing checkpoints and recipes to help others study LLM training and promote collaboration and education across the research community. The LLM360 methodology, developed by Petuum, MBZUAI and Cerebras, is a novel approach to advancing transparency and safety by open sourcing more of the model ingredients so that work is reproducible by others. In addition to releasing weights under the Apache 2.0 license, the model recipe, and a paper, the LLM360 methodology open sources the training code, up to 360 training checkpoints, pre-processing scripts, data buckets, and analytics tools:
- Model: Releases 360 checkpoints across the training run
- Data: Provides access to the data buckets for each checkpoint
- Code: Provides pre-processing, training, inference, and analysis code (if applicable)
- Metrics: Publicly discloses all training logs, evaluations, and analysis results collected during training, aligned with the corresponding training steps and data sequence
“Cerebras is proud to be the inaugural hardware partner for LLM360 and, in partnership with Petuum, to release the first model under this methodology, CrystalCoder-7B,” said Andrew Feldman, CEO and co-founder, Cerebras Systems. “We believe that transparency and reproducibility matter as much as model quality for the safe advancement of AI. We look forward to seeing more models released to the open source in this manner.”
In coding tasks, CrystalCoder approaches StarCoder-base in accuracy, while in language tasks it is comparable to Llama and MPT-7B. The significance of this model is that it performs well on both coding and language tasks: it is better at coding than the best language models and better at language than the best coding models. Developers previously had to choose between a coding model and a language model; CrystalCoder handles both kinds of task.
“Petuum and MBZUAI are excited to announce the release of the CrystalCoder-7B large language model (LLM). This groundbreaking collaboration, strengthened by our partnership with Cerebras, marks a significant milestone in the field of advanced open-source LLMs. CrystalCoder stands out due to its meticulously balanced and carefully selected data sets, unparalleled performance and reliability on language and code tasks,” said Hector Liu, Head of Engineering at Petuum and LLM Team Lead at MBZUAI. “Additionally, the unique design of Cerebras’s Condor Galaxy 1 transforms previously daunting large-scale training challenges into manageable tasks, setting new standards for efficiency and effectiveness in LLM training.”
CrystalCoder-7B is the latest in a family of leading open-source models co-developed by Cerebras, including Jais 13B and Jais 30B, the best bilingual Arabic models in the world, created in partnership with Core42 and now available on Azure Cloud. In June, Cerebras released BTLM-3B-8K, the number one 3B model on Hugging Face, which offers 7B-parameter performance in a light 3B-parameter model for inference. Med42, developed with M42 and Core42, is a leading clinical LLM, trained on Condor Galaxy 1 in a weekend and surpassing MedPaLM on performance and accuracy. In March, Cerebras released the first open-source family of GPT models, named Cerebras-GPT, followed by the release of the SlimPajama dataset, the best LLM dataset with the highest training efficiency.
For more information on CrystalCoder-7B and the LLM360 methodology, please visit llm360.ai.
About Cerebras Systems
Cerebras Systems is a team of pioneering computer architects, computer scientists, deep learning researchers, and engineers of all types. We have come together to build a new class of computer system, designed for the singular purpose of accelerating generative AI. Our flagship product, the CS-2 system, powered by the world’s largest and fastest AI processor, makes training large models simple and easy, by avoiding the complexity of distributed computing. Cerebras solutions are available in the cloud, through the Cerebras AI Model Studio or on premises. For further information, visit https://www.cerebras.net.
View source version on businesswire.com: https://www.businesswire.com/news/home/20231211668232/en/