Cerebras-GPT

A Family of Open, Compute-efficient, Large Language Models
Product Information
Release date: 22 June, 2023
Platform: Desktop

Cerebras-GPT Features

Cerebras open-sourced seven GPT-3-style models ranging from 111 million to 13 billion parameters. Trained using the Chinchilla formula, these models set new benchmarks for accuracy and compute efficiency.

Artificial intelligence has the potential to transform the world economy, but access to it is increasingly gated. The latest large language model, OpenAI's GPT-4, was released with no information on its model architecture, training data, training hardware, or hyperparameters. Companies are increasingly building large models using closed datasets and offering model outputs only via API access. For LLMs to be an open and accessible technology, Cerebras believes it is important to have state-of-the-art models that are open, reproducible, and royalty-free for both research and commercial applications.

To that end, Cerebras trained Cerebras-GPT, a family of transformer models built with the latest techniques and open datasets. These are the first GPT models trained using the Chinchilla formula and released under the Apache 2.0 license. Cerebras trains the GPT-3 architecture with the compute-optimal schedule implied by Chinchilla and the scaling rules indicated by μ-parameterization (μP), outperforming existing GPT-3 clones by a wide margin and marking the first confirmed use of μ-parameterization "in the wild". Because these models are trained from scratch on open data, the community no longer needs to depend on LLaMA and its restrictive, non-commercial license.
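For a sense of what the Chinchilla formula implies, here is a minimal sketch (not code from the Cerebras release). It assumes the well-known Chinchilla rule of thumb of roughly 20 training tokens per parameter and the standard approximation of training compute as C ≈ 6·N·D FLOPs; the endpoints of the size list come from the text above, while the intermediate sizes are the published Cerebras-GPT checkpoints.

```python
# Illustrative sketch (not Cerebras' training code): compute-optimal token
# budgets per the Chinchilla rule of thumb (~20 tokens per parameter) and
# the standard training-compute approximation C ≈ 6 * N * D FLOPs.

TOKENS_PER_PARAM = 20  # approximate Chinchilla compute-optimal ratio

# The seven Cerebras-GPT model sizes, in parameters.
model_sizes = [111e6, 256e6, 590e6, 1.3e9, 2.7e9, 6.7e9, 13e9]

for n_params in model_sizes:
    n_tokens = TOKENS_PER_PARAM * n_params  # compute-optimal token budget
    flops = 6 * n_params * n_tokens         # approximate training FLOPs
    print(f"{n_params / 1e9:5.2f}B params -> "
          f"{n_tokens / 1e9:6.1f}B tokens, ~{flops:.2e} FLOPs")
```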
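Because the checkpoints are Apache 2.0, they can be used directly in research or commercial applications. A minimal usage sketch follows, assuming the models are hosted on the Hugging Face Hub under the cerebras organization; the exact repository name below is an assumption worth verifying.

```python
# Usage sketch: load a Cerebras-GPT checkpoint with Hugging Face transformers.
# Assumes the checkpoint is published as "cerebras/Cerebras-GPT-1.3B".
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("cerebras/Cerebras-GPT-1.3B")
model = AutoModelForCausalLM.from_pretrained("cerebras/Cerebras-GPT-1.3B")

inputs = tokenizer("Generative AI is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```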